data_processing
Data processing functions for chromatogram analysis
add_log_data ¶
add_log_data(
Integral_Frame: DataFrame,
Log: DataFrame,
columns: list[str] | all = "all",
) -> DataFrame
For a dataframe that contains a timestamp column, data from a log dataframe is added. The log dataframe must similarly contain a timestamp column.
Parameters:
-
Integral_Frame(DataFrame) –DataFrame containing e.g. chromatogram integrals.
-
Log(DataFrame) –DataFrame containing log data with a timestamp column.
-
columns(list[str] | 'all', default:'all') –List of columns from the log to add. If 'all', all columns except timestamp are added.
Returns:
-
DataFrame–DataFrame containing the original dataframe data with log data added.
Source code in src/chromstream/data_processing.py
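How rows are matched between the two timestamp columns is not stated here. As a rough illustration only, the sketch below attaches the most recent log row to each integral row with `pandas.merge_asof`. The column names (`Timestamp`, `N2`, `Temperature`) and the backward-matching choice are assumptions, not documented behaviour of `add_log_data`.

```python
import pandas as pd

# Hypothetical integral table: one row per injection.
integrals = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(
            ["2024-01-01 10:00:30", "2024-01-01 10:05:30"]
        ),
        "N2": [1.25, 1.31],
    }
)

# Hypothetical log table, sampled on its own clock.
log = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(
            ["2024-01-01 10:00:00", "2024-01-01 10:05:00", "2024-01-01 10:10:00"]
        ),
        "Temperature": [350.0, 400.0, 450.0],
    }
)

# merge_asof needs both frames sorted on the key; "backward" picks the
# latest log row at or before each injection timestamp.
merged = pd.merge_asof(
    integrals.sort_values("Timestamp"),
    log.sort_values("Timestamp"),
    on="Timestamp",
    direction="backward",
)
```

A nearest-timestamp match (`direction="nearest"`) would be an equally plausible choice; which one fits depends on how the log is sampled relative to the injections.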
get_temp_and_valves_MTO ¶
For a Dataframe containing chromatogram integrals and a timestamp column, add data from a log file.
Source code in src/chromstream/data_processing.py
integrate_channel ¶
integrate_channel(
chromatogram: ChannelChromatograms,
peaklist: dict,
column: None | str = None,
) -> DataFrame
Integrate the signal of a chromatogram over time.
Parameters:
-
chromatogram(ChannelChromatograms) –ChannelChromatograms object containing the chromatograms to be analyzed
-
peaklist(dict) –Dictionary defining the peaks to integrate. Example:
Peaks_TCD = {"N2": [20, 26], "H2": [16, 19]}
-
column(None | str, default:None) –Optional column name to use for integration. If None, uses second column.
Returns:
-
DataFrame–DataFrame with integrated peak areas for each injection
Source code in src/chromstream/data_processing.py
integrate_single_chromatogram ¶
integrate_single_chromatogram(
chromatogram: Chromatogram,
peaklist: dict,
column: None | str = None,
) -> dict
Integrate the signal of a single chromatogram over time.
Parameters:
-
chromatogram(Chromatogram) –Chromatogram object containing the data to be analyzed
-
peaklist(dict) –Dictionary defining the peaks to integrate. Example:
Peaks_TCD = {"N2": [20, 26], "H2": [16, 19]}
-
column(None | str, default:None) –Optional column name to use for integration. If None, uses second column.
Returns:
-
dict–Dictionary with integrated peak areas and timestamp
Source code in src/chromstream/data_processing.py
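To make the peaklist format concrete, here is a minimal sketch that integrates a plain DataFrame over each `[start, end]` window with the trapezoid rule. It does not use the `Chromatogram` class, and the integration rule, the helper name `integrate_windows` and the column names `Time` and `Signal` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic trace: a triangular peak of height 3 between t = 20 and t = 26.
time = np.linspace(0.0, 30.0, 301)
signal = np.clip(3.0 - np.abs(time - 23.0), 0.0, None)
data = pd.DataFrame({"Time": time, "Signal": signal})

# Same structure as the peaklist example: name -> [start, end].
peaks = {"N2": [20, 26], "H2": [16, 19]}


def integrate_windows(frame, peaklist, time_col="Time", signal_col="Signal"):
    """Trapezoidal area of signal_col inside each [start, end] window."""
    areas = {}
    for name, (start, end) in peaklist.items():
        window = frame[(frame[time_col] >= start) & (frame[time_col] <= end)]
        t = window[time_col].to_numpy()
        y = window[signal_col].to_numpy()
        # Trapezoid rule written out explicitly.
        areas[name] = float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)
    return areas


areas = integrate_windows(data, peaks)
```

The triangle has area 9 (base 6, height 3), so `areas["N2"]` should come out close to 9, while the empty `H2` window integrates to roughly zero.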
linear_baseline ¶
Determines a linear baseline between the signal values at the two specified time points and subtracts it from the signal.
Parameters:
-
data(DataFrame) –DataFrame containing time and signal columns
-
start_time(float) –Time point to define the start of the baseline. Use the same unit as the chromatogram.
-
end_time(float) –Time point to define the end of the baseline. Use the same unit as the chromatogram.
Returns:
-
Series–Corrected signal as pandas Series
Source code in src/chromstream/data_processing.py
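The correction described above can be sketched on a plain DataFrame as follows. How the signal value at a requested time point is looked up is not specified here; the sketch assumes the nearest sample is used, and the function name and column names are hypothetical.

```python
import numpy as np
import pandas as pd


def linear_baseline_sketch(data, start_time, end_time,
                           time_col="Time", signal_col="Signal"):
    """Subtract the straight line through the signal at two time points."""
    t = data[time_col].to_numpy(dtype=float)
    y = data[signal_col].to_numpy(dtype=float)
    # Assumption: use the sample nearest to each requested time point.
    i0 = int(np.argmin(np.abs(t - start_time)))
    i1 = int(np.argmin(np.abs(t - end_time)))
    slope = (y[i1] - y[i0]) / (t[i1] - t[i0])
    baseline = y[i0] + slope * (t - t[i0])
    return pd.Series(y - baseline, index=data.index, name=signal_col)


# Drifting baseline (2 + 0.5 * t) with a single spike of height 10 at t = 5.
time = np.arange(0.0, 11.0)
signal = 2.0 + 0.5 * time
signal[5] += 10.0
corrected = linear_baseline_sketch(
    pd.DataFrame({"Time": time, "Signal": signal}), start_time=1.0, end_time=9.0
)
```

After the correction the drift is gone and only the spike remains, which is why both anchor points should sit on peak-free parts of the trace.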
list_baseline_functions ¶
List available baseline functions.
Parameters:
-
verbose(bool, default:False) –If True, include each function docstring in the output.
Returns:
-
str–String with one baseline function per block.
Source code in src/chromstream/data_processing.py
min_subtract ¶
Simple minimum subtraction baseline correction
Parameters:
-
data(DataFrame) –DataFrame containing time and signal columns
Returns:
-
Series–Corrected signal as pandas Series
Source code in src/chromstream/data_processing.py
register_baseline ¶
register_baseline(
func: BaselineFunction,
) -> BaselineFunction
split_chromatogram ¶
split_chromatogram(
chromatogram: Chromatogram,
n_injections: int,
start_offset: int = 0,
end_offset: int = 0,
reset_time=True,
) -> list[Chromatogram]
When multiple injections are contained in a single chromatogram, this function splits it into multiple chromatograms. An important constraint is that the length of the chromatogram must be divisible by the number of injections. The injection time of each split chromatogram is adjusted based on the runtime.
Parameters:
-
chromatogram(Chromatogram) –The chromatogram to be split.
-
n_injections(int) –The number of injections to split the chromatogram into.
-
start_offset(int, default:0) –Number of data points to skip at the start of the chromatogram. Defaults to 0.
-
end_offset(int, default:0) –Number of data points to skip at the end of the chromatogram. Defaults to 0.
-
reset_time(bool, default:True) –Whether to reset the time column to start from 0 for each split chromatogram. Defaults to True.
Returns:
-
list[Chromatogram]–A list of split chromatograms.
Source code in src/chromstream/data_processing.py
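The splitting logic can be illustrated on a plain DataFrame. This sketch does not use the `Chromatogram` class and leaves out the injection-time adjustment. It assumes that the divisibility check applies to the length remaining after the offsets are removed; the helper name `split_frame` and the `Time` column are hypothetical.

```python
import numpy as np
import pandas as pd


def split_frame(data, n_injections, start_offset=0, end_offset=0,
                reset_time=True, time_col="Time"):
    """Split a trace into n_injections equally long segments."""
    stop = len(data) - end_offset
    trimmed = data.iloc[start_offset:stop]
    # Assumption: divisibility is checked on the trimmed length.
    if len(trimmed) % n_injections != 0:
        raise ValueError("length is not divisible by n_injections")
    size = len(trimmed) // n_injections
    parts = []
    for i in range(n_injections):
        part = trimmed.iloc[i * size:(i + 1) * size].reset_index(drop=True)
        if reset_time:
            # Shift the time axis so every segment starts at 0.
            part[time_col] = part[time_col] - part[time_col].iloc[0]
        parts.append(part)
    return parts


trace = pd.DataFrame({"Time": np.arange(12.0), "Signal": np.arange(12.0) ** 2})
parts = split_frame(trace, n_injections=3)
```

With 12 data points and 3 injections, each segment holds 4 points; asking for 5 injections would raise the `ValueError` because 12 is not divisible by 5.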
time_point_baseline ¶
time_point_baseline(
data: DataFrame, time_point: float
) -> Series
Use signal value at a specific time point as baseline
Parameters:
-
data(DataFrame) –DataFrame containing time and signal columns
-
time_point(float) –Time point to use as baseline reference. Use the same unit as the chromatogram.
Returns:
-
Series–Corrected signal as pandas Series
Source code in src/chromstream/data_processing.py
time_window_baseline ¶
Use mean of signal in a specific time window as baseline
Parameters:
-
data(DataFrame) –DataFrame containing time and signal columns
-
time_window(tuple[float, float], default:(0, 1)) –Tuple specifying the start and end time of the baseline window. Use the same unit as the chromatogram.
Returns:
-
Series–Corrected signal as pandas Series
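A minimal sketch of the window-mean correction on a plain DataFrame is shown below. Treating both window bounds as inclusive, and the function name and column names, are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def window_baseline_sketch(data, time_window=(0, 1),
                           time_col="Time", signal_col="Signal"):
    """Subtract the mean signal inside time_window from the whole trace."""
    start, end = time_window
    # Assumption: both window bounds are inclusive.
    in_window = (data[time_col] >= start) & (data[time_col] <= end)
    return data[signal_col] - data.loc[in_window, signal_col].mean()


time = np.linspace(0.0, 5.0, 11)   # 0.0, 0.5, ..., 5.0
signal = np.full_like(time, 4.0)   # constant offset of 4
signal[6] += 2.0                   # small peak at t = 3.0
corrected = window_baseline_sketch(
    pd.DataFrame({"Time": time, "Signal": signal}), time_window=(0, 1)
)
```

Averaging over a window is less sensitive to noise than a single reference point, provided the window contains no peaks.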