# Excel Filtering and Formatting Script The file in this repository filters has the intent to filter all written exams that require a 'large room' (>65 inschrijvingen) and thus need to be brought to the meeting which assigns large rooms to written exams on campus. The output has the layout in mind of the master file provided by E-campus but may need changes if the master file changes. ## Prerequisites Make sure you have the following software installed: - Python 3.x - Pip (Python package installer) ## Required Packages The script depends on the following Python packages: - `pandas` You can install the required package using pip: ```bash pip install pandas ``` ## Description The script performs the following operations: 1. Reads data from the specified Excel file and sheet. 2. Filters rows based on the value of the 'Examenvorm' column and the count of 'Aant. inschr.' column. 3. Selects specific columns from the filtered DataFrame. 4. Formats time strings in the columns 'Beginuur S+' and 'Einduur S+'. 5. Cleans the 'Docenten' column by removing specific keywords and trimming whitespace. 6. Writes the processed DataFrame to a new Excel file. ## Usage 1. Place your Excel file in the same directory as the script. 2. Update the `file_path` and `sheet_name` variables in the script with your specific file path and sheet name. 3. Run the script: ```bash python script.py ``` ## Code ```python import pandas as pd file_path = 'file.xlsx' sheet_name = 'ps (32)' # Read the data from the Excel file df = pd.read_excel(file_path, sheet_name=sheet_name) # Filter the data based on certain criteria filtered_df = df[df['Examenvorm'] == 'Schriftelijk'] filtered_df = filtered_df[filtered_df['Aant. inschr.'] > 65] filtered_df = filtered_df[['Datum S+', 'Beginuur S+', 'Einduur S+', 'Studiegidsnr.', 'Omschrijving', 'Docenten', 'Aant. inschr.']] # Format the time strings filtered_df['Beginuur S+'] = filtered_df['Beginuur S+'].apply(lambda x: x.strftime('%H:%M')) filtered_df['Einduur S+'] = filtered_df['Einduur S+'].apply(lambda x: x.strftime('%H:%M')) filtered_df['Docenten'] = filtered_df['Docenten'].str.replace(r'\b(Titularis|Co-Titularis|Medewerker)\b', '', regex=True).str.strip() # Save the filtered and formatted data to a new Excel file filtered_df.to_excel('filtered_grote_lokalen.xlsx', index=False) ``` ## Additional Notes - This script assumes that the input Excel file has specific columns like 'Examenvorm', 'Aant. inschr.', 'Datum S+', 'Beginuur S+', 'Einduur S+', 'Studiegidsnr.', 'Omschrijving', and 'Docenten'. - Make sure that the time columns ('Beginuur S+' and 'Einduur S+') are in datetime format in the original Excel file for the `.strftime('%H:%M')` method to work correctly. - The `Docenten` column will be cleaned by removing occurrences of the keywords 'Titularis', 'Co-Titularis', and 'Medewerker'. Feel free to adjust the script according to your specific needs. ## License This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.