python

How to Use Python's pickle Module

January 18, 2024 (January 18, 2024)

python, pickle

An overview of using Python’s pickle module for object serialization and deserialization, with file paths specified using pathlib. Object Serialization: Serialize (save) objects to a file using pickle. import pickle from pathlib import Path # Create an object to serialize my_data = {'key': 'value', 'number': 42} # Define the save path and serialize the object path = Path('data.pkl') with path.open('wb') as file: pickle.dump(my_data, file) Object Deserialization: Deserialize (load) objects from a file using pickle. ...
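The excerpt above is cut off before the loading step, so here is a minimal round-trip sketch of what saving and then loading with pickle and pathlib can look like. The `my_data` dictionary comes from the excerpt; the temporary directory and the `loaded_data` name are assumptions added for illustration, and the post's own deserialization code may differ.

```python
import pickle
import tempfile
from pathlib import Path

# Object to serialize (taken from the excerpt)
my_data = {'key': 'value', 'number': 42}

# Assumption: write into a temporary directory so the sketch leaves no file behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'data.pkl'

    # Serialize: pickle writes bytes, so the file is opened in binary write mode
    with path.open('wb') as file:
        pickle.dump(my_data, file)

    # Deserialize: open in binary read mode and rebuild the object
    with path.open('rb') as file:
        loaded_data = pickle.load(file)

# The loaded object is an equal copy, not the same object
print(loaded_data == my_data)
print(loaded_data is my_data)
```

Note that `pickle.load` can execute arbitrary code while rebuilding objects, so it should only be used on files from a trusted source.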

How to split a list into n parts in Python

December 7, 2023 (December 8, 2023)

list, python

The split_list_into_n_chunks function divides a given list into n sublists of nearly equal size. If the list’s length is not evenly divisible by n, some sublists receive one extra element so that the division is as even as possible. Function Definition: def split_list_into_n_chunks(original_list: list, split_num: int) -> list: chunk_size, remainder = divmod(len(original_list), split_num) chunks = [] start = 0 for _ in range(split_num): end = start + chunk_size + (1 if remainder > 0 else 0) chunks. ...
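The excerpt above stops partway through the loop body. One plausible way to finish it is sketched below: the lines up to `end = ...` follow the excerpt, while the append, the `start` and `remainder` updates, and the return are assumptions and may not match the post's actual code.

```python
def split_list_into_n_chunks(original_list: list, split_num: int) -> list:
    # Base chunk size, plus how many chunks need one extra element
    chunk_size, remainder = divmod(len(original_list), split_num)
    chunks = []
    start = 0
    for _ in range(split_num):
        # While remainder is positive, this chunk takes one extra element
        end = start + chunk_size + (1 if remainder > 0 else 0)
        # Assumed completion: slice out the chunk and advance the window
        chunks.append(original_list[start:end])
        start = end
        remainder -= 1
    return chunks


result = split_list_into_n_chunks(list(range(10)), 3)
print(result)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

In this sketch the extra elements go to the leading chunks, and asking for more chunks than there are elements yields empty trailing chunks rather than an error.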

Generating Images for the Confusion Matrix and Classification Report in Python

July 14, 2023 (August 23, 2023)

python, metric, classification, confusion-matrix

In machine learning, a common task is to render evaluation metrics such as the confusion matrix and the classification report as images, which makes model performance easier to inspect and share. Here, I will demonstrate how to generate and save these metrics as images using Python’s scikit-learn, matplotlib, and seaborn. Confusion matrix: First, here’s the function for the confusion matrix: import matplotlib.figure import matplotlib.pyplot as plt import pandas as pd import seaborn as sns from sklearn. ...

How to handle jsonl in Python

January 4, 2023 (September 12, 2023)

python, jsonl

Polars (read): import polars as pl # Eager evaluation data_df = pl.read_ndjson("file.jsonl") print(data_df.describe()) # Lazy evaluation data_df = pl.scan_ndjson("file.jsonl") # A lazy frame must be collected before calling describe() data_df = data_df.collect() print(data_df.describe()) Polars (write): import polars as pl # Sample data as list[dict] data_list = [{"name": "alice", "age": "18"}, {"name": "bob", "age": "17"}] data_df = pl.DataFrame(data_list) data_df.write_ndjson("file.jsonl") Pandas (read): import pandas as pd data_df = pd.read_json("file.jsonl", orient="records", lines=True) print(data_df. ...
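For comparison with the Polars and Pandas snippets above, the same read and write steps can be sketched with only the standard library, since a JSONL file is just one JSON document per line. This is not from the post: the `write_jsonl` and `read_jsonl` helpers and the temporary directory are hypothetical names introduced for illustration, and only the sample `data_list` is taken from the excerpt.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(path: Path, records: list[dict]) -> None:
    # Hypothetical helper: one JSON object per line, newline-terminated
    with path.open('w', encoding='utf-8') as file:
        for record in records:
            file.write(json.dumps(record, ensure_ascii=False) + '\n')


def read_jsonl(path: Path) -> list[dict]:
    # Hypothetical helper: parse each non-blank line as its own JSON document
    with path.open('r', encoding='utf-8') as file:
        return [json.loads(line) for line in file if line.strip()]


# Sample data from the excerpt
data_list = [{"name": "alice", "age": "18"}, {"name": "bob", "age": "17"}]

# Assumption: use a temporary directory instead of "file.jsonl" in the working directory
with tempfile.TemporaryDirectory() as tmp_dir:
    jsonl_path = Path(tmp_dir) / 'file.jsonl'
    write_jsonl(jsonl_path, data_list)
    loaded = read_jsonl(jsonl_path)

print(loaded == data_list)
```

A plain-`json` approach like this suits small files or streaming line by line; the DataFrame libraries are the better fit when the data needs to be analyzed after loading.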