Script Documentation

Overview

This script processes two Excel files (reinoud.xlsx and sisa.xlsx) to find and append missing IDs from sisa.xlsx to reinoud.xlsx. It also checks for duplicate IDs in reinoud.xlsx.

Functions

load_excel(file_path: str, sheet_name: Optional[str] = None) -> pd.DataFrame

Loads an Excel file into a DataFrame.
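The function body is not shown in this README, so the following is only a minimal sketch of how such a loader could look, assuming it wraps pandas.read_excel. The fallback to the first sheet when sheet_name is None is an assumption: pandas returns a dict of all sheets for sheet_name=None, which would not match the documented pd.DataFrame return type.

```python
from typing import Optional

import pandas as pd


def load_excel(file_path: str, sheet_name: Optional[str] = None) -> pd.DataFrame:
    """Load one sheet of an Excel file into a DataFrame (illustrative sketch)."""
    # Assumption: with no sheet name given, read the first sheet (index 0).
    # Passing sheet_name=None straight through would return a dict of sheets.
    target = 0 if sheet_name is None else sheet_name
    return pd.read_excel(file_path, sheet_name=target)
```

Called as load_excel('reinoud.xlsx', sheet_name='Actief'), this would return only the 'Actief' sheet.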

check_duplicates(df: pd.DataFrame, column: str) -> List[str]

Checks for duplicate values in a specified column.
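One possible way to implement this check is sketched below. It assumes that IDs are compared as strings and that empty cells are ignored; the actual script may treat these cases differently.

```python
from typing import List

import pandas as pd


def check_duplicates(df: pd.DataFrame, column: str) -> List[str]:
    """Return each value that occurs more than once in `column` (illustrative sketch)."""
    # Assumption: empty cells are skipped and IDs are compared as strings.
    values = df[column].dropna().astype(str)
    # duplicated() flags every occurrence after the first one.
    return values[values.duplicated()].unique().tolist()


example = pd.DataFrame({"Rolnummer": ["A1", "B2", "A1", "C3", "B2", "A1"]})
print(check_duplicates(example, "Rolnummer"))  # ['A1', 'B2']
```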

find_missing_ids(df1: pd.DataFrame, df2: pd.DataFrame, column: str) -> List[str]

Finds IDs in df2 that are not in df1.
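The comparison can be sketched as a set membership test. This is a hedged illustration rather than the script's actual code: converting IDs to strings and reporting each missing ID only once are assumptions.

```python
from typing import List

import pandas as pd


def find_missing_ids(df1: pd.DataFrame, df2: pd.DataFrame, column: str) -> List[str]:
    """Return IDs present in df2[column] but absent from df1[column] (illustrative sketch)."""
    # Assumption: IDs are compared as strings and empty cells are ignored.
    known = set(df1[column].dropna().astype(str))
    candidates = df2[column].dropna().astype(str)
    missing = candidates[~candidates.isin(known)]
    # Keep the order of first appearance in df2 and report each ID once.
    return missing.drop_duplicates().tolist()


reinoud = pd.DataFrame({"Rolnummer": ["101", "102"]})
sisa = pd.DataFrame({"Rolnummer": ["102", "103", "104", "103"]})
print(find_missing_ids(reinoud, sisa, "Rolnummer"))  # ['103', '104']
```

Note that string comparison is sensitive to how the column was read: an ID stored as the number 101 in one file and as 101.0 in the other would not match, so reading the ID column as text in both files is the safer choice.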

append_missing_ids(reinoud_df: pd.DataFrame, sisa_df: pd.DataFrame, column: str, reinoud_file: str) -> pd.DataFrame

Appends missing IDs and corresponding details from sisa_df to reinoud_df.
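A minimal sketch of the append step follows. Several details are assumptions, not confirmed by this README: only columns that exist in both sheets are copied, repeated IDs in sisa_df are appended once, and reinoud_file is used only in the log message (saving is left to main, as described below).

```python
import logging

import pandas as pd


def append_missing_ids(
    reinoud_df: pd.DataFrame, sisa_df: pd.DataFrame, column: str, reinoud_file: str
) -> pd.DataFrame:
    """Return reinoud_df plus the sisa_df rows whose ID is missing (illustrative sketch)."""
    known = set(reinoud_df[column].dropna().astype(str))
    sisa_ids = sisa_df[column].astype(str)
    # Rows from sisa_df that have an ID which reinoud_df does not contain yet.
    new_rows = sisa_df[sisa_df[column].notna() & ~sisa_ids.isin(known)]
    new_rows = new_rows.drop_duplicates(subset=column)
    # Assumption: copy only the columns that both sheets share.
    shared = [c for c in new_rows.columns if c in reinoud_df.columns]
    logging.info("Appending %d rows to %s", len(new_rows), reinoud_file)
    return pd.concat([reinoud_df, new_rows[shared]], ignore_index=True)
```

The function returns a new DataFrame and leaves both inputs unchanged, which makes it easy to inspect the result before anything is written back to disk.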

main(reinoud_file: str, sisa_file: str, column: str, reinoud_sheet: Optional[str] = None, sisa_sheet: Optional[str] = None)

Main function to load the Excel files, check for duplicates, append missing IDs, and save the updated DataFrame back to the Excel file.

Usage

Run the script with the following command:

python script.py

Example usage within the script:

if __name__ == "__main__":
    main('reinoud.xlsx', 'sisa.xlsx', 'Rolnummer', reinoud_sheet='Actief', sisa_sheet='sheet1')

Logging

The script uses the logging module to log information and errors. The log level is set to INFO.
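A logging setup matching this description could look like the sketch below. The helper name configure_logging, the message format, and the use of force=True are illustrative assumptions; the script may simply call logging.basicConfig(level=logging.INFO).

```python
import logging


def configure_logging() -> logging.Logger:
    """Set the root log level to INFO and return a module logger (illustrative sketch)."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        force=True,  # Python 3.8+: replace any handlers configured earlier
    )
    return logging.getLogger(__name__)


logger = configure_logging()
logger.info("Starting comparison of %s and %s", "reinoud.xlsx", "sisa.xlsx")
logger.debug("Not shown: DEBUG is below the INFO threshold")
```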

File Structure

.gitignore
reinoud.xlsx
script.py
sisa.xlsx

Dependencies

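  • pandas
  • openpyxl (needed by pandas to read and write .xlsx files unless another Excel engine is configured)
  • logging (part of the Python standard library; no installation needed)

Install dependencies using:

pip install pandas openpyxl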

License

This script is provided "as-is" without any warranty. Use at your own risk.