LEARN FROM SCRATCH SIGNAL AND IMAGE PROCESSING WITH PYTHON GUI


Book Description

In this book, you will learn how to use OpenCV, NumPy library and other libraries to perform signal processing, image processing, object detection, and feature extraction with Python GUI (PyQt). You will learn how to filter signals, detect edges and segments, and denoise images with PyQt. You will also learn how to detect objects (face, eye, and mouth) using Haar Cascades and how to detect features on images using Harris Corner Detection, Shi-Tomasi Corner Detector, Scale-Invariant Feature Transform (SIFT), and Features from Accelerated Segment Test (FAST). In Chapter 1, you will learn: Tutorial Steps To Create A Simple GUI Application, Tutorial Steps to Use Radio Button, Tutorial Steps to Group Radio Buttons, Tutorial Steps to Use CheckBox Widget, Tutorial Steps to Use Two CheckBox Groups, Tutorial Steps to Understand Signals and Slots, Tutorial Steps to Convert Data Types, Tutorial Steps to Use Spin Box Widget, Tutorial Steps to Use ScrollBar and Slider, Tutorial Steps to Use List Widget, Tutorial Steps to Select Multiple List Items in One List Widget and Display It in Another List Widget, Tutorial Steps to Insert Item into List Widget, Tutorial Steps to Use Operations on Widget List, Tutorial Steps to Use Combo Box, Tutorial Steps to Use Calendar Widget and Date Edit, and Tutorial Steps to Use Table Widget. In Chapter 2, you will learn: Tutorial Steps To Create A Simple Line Graph, Tutorial Steps To Create A Simple Line Graph in Python GUI, Tutorial Steps To Create A Simple Line Graph in Python GUI: Part 2, Tutorial Steps To Create Two or More Graphs in the Same Axis, Tutorial Steps To Create Two Axes in One Canvas, Tutorial Steps To Use Two Widgets, Tutorial Steps To Use Two Widgets, Each of Which Has Two Axes, Tutorial Steps To Use Axes With Certain Opacity Levels, Tutorial Steps To Choose Line Color From Combo Box, Tutorial Steps To Calculate Fast Fourier Transform, Tutorial Steps To Create GUI For FFT, Tutorial Steps To Create GUI For FFT With Some Other Input Signals, Tutorial Steps To Create GUI For Noisy Signal, Tutorial Steps To Create GUI For Noisy Signal Filtering, and Tutorial Steps To Create GUI For Wav Signal Filtering. In Chapter 3, you will learn: Tutorial Steps To Convert RGB Image Into Grayscale, Tutorial Steps To Convert RGB Image Into YUV Image, Tutorial Steps To Convert RGB Image Into HSV Image, Tutorial Steps To Filter Image, Tutorial Steps To Display Image Histogram, Tutorial Steps To Display Filtered Image Histogram, Tutorial Steps To Filter Image With CheckBoxes, Tutorial Steps To Implement Image Thresholding, and Tutorial Steps To Implement Adaptive Image Thresholding. In Chapter 4, you will learn: Tutorial Steps To Generate And Display Noisy Image, Tutorial Steps To Implement Edge Detection On Image, Tutorial Steps To Implement Image Segmentation Using Multiple Thresholding and K-Means Algorithm, and Tutorial Steps To Implement Image Denoising. In Chapter 5, you will learn: Tutorial Steps To Detect Face, Eye, and Mouth Using Haar Cascades, Tutorial Steps To Detect Face Using Haar Cascades with PyQt, Tutorial Steps To Detect Eye, and Mouth Using Haar Cascades with PyQt, and Tutorial Steps To Extract Detected Objects. In Chapter 6, you will learn: Tutorial Steps To Detect Image Features Using Harris Corner Detection, Tutorial Steps To Detect Image Features Using Shi-Tomasi Corner Detection, Tutorial Steps To Detect Features Using Scale-Invariant Feature Transform (SIFT), and Tutorial Steps To Detect Features Using Features from Accelerated Segment Test (FAST). You can download the XML files from https://viviansiahaan.blogspot.com/2023/06/learn-from-scratch-signal-and-image.html.




START FROM SCRATCH DIGITAL IMAGE PROCESSING WITH TKINTER


Book Description

"Start from Scratch: Digital Image Processing with Tkinter" is a beginner-friendly guide that delves into the basics of digital image processing using Python and Tkinter, a popular GUI library. The project is divided into distinct modules, each focusing on a specific aspect of image manipulation. The journey begins with an exploration of Image Color Space. Here, readers encounter the Main Form, which serves as the entry point to the application. It provides a user-friendly interface for loading images, selecting color spaces, and visualizing various color channels. The Fundamental Utilities play a crucial role by providing core functionalities like loading images, converting color spaces, and manipulating pixel data. The project also includes forms dedicated to displaying individual color channels and offering insights into the current color space through histograms. The Plotting Utilities module facilitates the creation of visual representations such as plots and graphs, enhancing the user's understanding of color spaces. Moving on, the Image Transformation section introduces readers to techniques like the Fast Fourier Transform (FFT). The Fast Fourier Transform Utilities module enables the implementation of FFT algorithms for converting images from spatial to frequency domains. A corresponding form allows users to view images in the frequency domain, with additional adjustments made to the plotting utilities for effective visualization. In the context of Discrete Cosine Transform (DCT), readers gain insights into algorithms and functions for transforming images. The Form for Discrete Cosine Transform aids in visualizing images in the DCT domain, while the plotting utilities are modified to accommodate these transformed images. The Discrete Sine Transform (DST) section introduces readers to DST algorithms and their role in image transformation. A dedicated form for visualizing images in the DST domain is provided, and the plotting utilities are further extended to handle these transformations effectively. Moving Average Smoothing is another critical aspect covered in the project. The Filter2D Utilities facilitate the application of moving average smoothing techniques. Additionally, metrics utilities enable the assessment of the smoothing process, with forms available for displaying both metrics and the smoothed images. Next, the project addresses Exponential Moving Average techniques, modifying the existing utilities to accommodate this specific approach. Similarly, forms for visualizing results and metrics are provided. Readers are then introduced to techniques like Median Filtering, Savitzky-Golay Filtering, and Wiener Filtering. The Filter2D Utilities are adapted to facilitate these filtering methods, and metrics utilities are employed to assess the effectiveness of each technique. Forms dedicated to each filtering method provide a platform for visualizing the results. The final section of the project explores techniques such as Total Variation Denoising, Non-Local Means Denoising, and PCA Denoising. The Filter2D Utilities are once again modified to support these denoising techniques. Metrics utilities are employed to evaluate the denoising process, and dedicated forms offer visualization capabilities. By breaking down the project into these modules, readers can systematically grasp the fundamentals of digital image processing, gradually building their skills from one concept to the next. Each section provides hands-on experience and practical knowledge, making it an ideal starting point for beginners in image processing.




Python GUI For Signal and Image Processing


Book Description

You will learn to create GUI applications using the Qt toolkit. The Qt toolkit, also popularly known as Qt, is a cross-platform application and UI framework developed by Trolltech, which is used to develop GUI applications. You will develop an existing GUI by adding several Line Edit widgets to read input, which are used to set the range and step of the graph (signal). Next, Now, you can use a widget for each graph. Add another Widget from Containers in gui_graphics.ui using Qt Designer. Then, Now, you can use two Widgets, each of which has two canvases. The two canvases has QVBoxLayout in each Widget. Finally, you will apply those Widgets to display the results of signal and image processing techniques.




START FROM SCRATCH DIGITAL SIGNAL PROCESSING WITH TKINTER


Book Description

In this project, you will create a multi-form GUI to implement digital signal processing. Creating a GUI involves designing an interface where users can input parameters and visualize the results of various signal processing techniques. Each form corresponds to a specific technique and is implemented using the tkinter library. The "Simple Sinusoidal Form" allows users to generate and visualize a basic sinusoidal signal. It includes input fields for parameters like frequency, amplitude, and time period. The utilities associated with this form provide functions to generate and plot the simple sinusoidal signal. The "Two Sinusoidals Form" extends the previous form, enabling users to generate and visualize two combined sinusoidal signals. It provides input fields for frequencies, amplitudes, and time periods of both signals. The utilities handle the generation and plotting of the combined sinusoidal signals. The "More Two Sinusoidals Form" further extends the previous form to generate and visualize additional combined sinusoidal signals. It includes input fields for frequencies, amplitudes, and time periods of three sinusoidal signals. The utilities handle the generation and plotting of these combined signals. Forms for various modulation techniques (AM, FM, PM, ASK, FSK, PSK) are available. These allow users to generate and visualize modulated signals by providing input fields for modulation indices, carrier frequencies, and time periods. The utilities in each form handle the signal generation and modulation process, as well as the plotting of the modulated signals. Forms for different filter designs (FIR, Butterworth, Chebyshev Type 1) cover lowpass, highpass, bandpass, and bandstop filters. They include input fields for filter order, cutoff frequencies, and other relevant parameters. The utilities in each form implement the filter design and frequency response plotting. Wavelet transformation forms focus on wavelet-based techniques, including scaling, decomposition, and denoising. They provide input fields for wavelet type, thresholding methods, and other wavelet-specific parameters. The utilities handle the wavelet transformations, denoising, and visualizing the results. Forms for various denoising techniques (MA, EMA, Median, SGF, Wiener, TV, NLM, PCA) cover different smoothing and denoising methods. They offer input fields for relevant denoising parameters. The utilities for each form implement the denoising process and display the denoised signals. Each form's utility methods interact with the GUI elements, taking user inputs and performing the corresponding signal processing tasks. These utilities encapsulate the underlying algorithms and ensure a seamless interaction between the user interface and the backend computations. In summary, this session involves creating a comprehensive GUI for a wide range of signal processing techniques, including signal generation, modulation, filtering, wavelet transformations, and various denoising methods. Each form and its associated utilities handle specific tasks, ensuring an intuitive and effective user experience.




PYTHON GUI PROJECTS WITH MACHINE LEARNING AND DEEP LEARNING


Book Description

PROJECT 1: THE APPLIED DATA SCIENCE WORKSHOP: Prostate Cancer Classification and Recognition Using Machine Learning and Deep Learning with Python GUI Prostate cancer is cancer that occurs in the prostate. The prostate is a small walnut-shaped gland in males that produces the seminal fluid that nourishes and transports sperm. Prostate cancer is one of the most common types of cancer. Many prostate cancers grow slowly and are confined to the prostate gland, where they may not cause serious harm. However, while some types of prostate cancer grow slowly and may need minimal or even no treatment, other types are aggressive and can spread quickly. The dataset used in this project consists of 100 patients which can be used to implement the machine learning and deep learning algorithms. The dataset consists of 100 observations and 10 variables (out of which 8 numeric variables and one categorical variable and is ID) which are as follows: Id, Radius, Texture, Perimeter, Area, Smoothness, Compactness, Diagnosis Result, Symmetry, and Fractal Dimension. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 2: THE APPLIED DATA SCIENCE WORKSHOP: Urinary Biomarkers Based Pancreatic Cancer Classification and Prediction Using Machine Learning with Python GUI Pancreatic cancer is an extremely deadly type of cancer. Once diagnosed, the five-year survival rate is less than 10%. However, if pancreatic cancer is caught early, the odds of surviving are much better. Unfortunately, many cases of pancreatic cancer show no symptoms until the cancer has spread throughout the body. A diagnostic test to identify people with pancreatic cancer could be enormously helpful. In a paper by Silvana Debernardi and colleagues, published this year in the journal PLOS Medicine, a multi-national team of researchers sought to develop an accurate diagnostic test for the most common type of pancreatic cancer, called pancreatic ductal adenocarcinoma or PDAC. They gathered a series of biomarkers from the urine of three groups of patients: Healthy controls, Patients with non-cancerous pancreatic conditions, like chronic pancreatitis, and Patients with pancreatic ductal adenocarcinoma. When possible, these patients were age- and sex-matched. The goal was to develop an accurate way to identify patients with pancreatic cancer. The key features are four urinary biomarkers: creatinine, LYVE1, REG1B, and TFF1. Creatinine is a protein that is often used as an indicator of kidney function. YVLE1 is lymphatic vessel endothelial hyaluronan receptor 1, a protein that may play a role in tumor metastasis. REG1B is a protein that may be associated with pancreas regeneration. TFF1 is trefoil factor 1, which may be related to regeneration and repair of the urinary tract. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, and MLP classifier. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 3: DATA SCIENCE CRASH COURSE: Voice Based Gender Classification and Prediction Using Machine Learning and Deep Learning with Python GUI This dataset was created to identify a voice as male or female, based upon acoustic properties of the voice and speech. The dataset consists of 3,168 recorded voice samples, collected from male and female speakers. The voice samples are pre-processed by acoustic analysis in R using the seewave and tuneR packages, with an analyzed frequency range of 0hz-280hz (human vocal range). The following acoustic properties of each voice are measured and included within the CSV: meanfreq: mean frequency (in kHz); sd: standard deviation of frequency; median: median frequency (in kHz); Q25: first quantile (in kHz); Q75: third quantile (in kHz); IQR: interquantile range (in kHz); skew: skewness; kurt: kurtosis; sp.ent: spectral entropy; sfm: spectral flatness; mode: mode frequency; centroid: frequency centroid (see specprop); peakf: peak frequency (frequency with highest energy); meanfun: average of fundamental frequency measured across acoustic signal; minfun: minimum fundamental frequency measured across acoustic signal; maxfun: maximum fundamental frequency measured across acoustic signal; meandom: average of dominant frequency measured across acoustic signal; mindom: minimum of dominant frequency measured across acoustic signal; maxdom: maximum of dominant frequency measured across acoustic signal; dfrange: range of dominant frequency measured across acoustic signal; modindx: modulation index. Calculated as the accumulated absolute difference between adjacent measurements of fundamental frequencies divided by the frequency range; and label: male or female. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 4: DATA SCIENCE CRASH COURSE: Thyroid Disease Classification and Prediction Using Machine Learning and Deep Learning with Python GUI Thyroid disease is a general term for a medical condition that keeps your thyroid from making the right amount of hormones. Thyroid typically makes hormones that keep body functioning normally. When the thyroid makes too much thyroid hormone, body uses energy too quickly. The two main types of thyroid disease are hypothyroidism and hyperthyroidism. Both conditions can be caused by other diseases that impact the way the thyroid gland works. Dataset used in this project was from Garavan Institute Documentation as given by Ross Quinlan 6 databases from the Garavan Institute in Sydney, Australia. Approximately the following for each database: 2800 training (data) instances and 972 test instances. This dataset contains plenty of missing data, while 29 or so attributes, either Boolean or continuously-valued. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will develop a GUI using PyQt5 to plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.




ZERO TO MASTERY: THE COMPLETE GUIDE TO LEARNING SQLITE AND PYTHON GUI


Book Description

In this project, we provide you with the SQLite version of The Oracle Database Sample Schemas that provides a common platform for examples in each release of the Oracle Database. The sample database is also a good database for practicing with SQL, especially SQLite. The detailed description of the database can be found on: http://luna-ext.di.fc.ul.pt/oracle11g/server.112/e10831/diagrams.htm#insertedID0. The four schemas are a set of interlinked schemas. This set of schemas provides a layered approach to complexity: A simple schema Human Resources (HR) is useful for introducing basic topics. An extension to this schema supports Oracle Internet Directory demos; A second schema, Order Entry (OE), is useful for dealing with matters of intermediate complexity. Many data types are available in this schema, including non-scalar data types; The Online Catalog (OC) subschema is a collection of object-relational database objects built inside the OE schema; The Product Media (PM) schema is dedicated to multimedia data types; The Sales History (SH) schema is designed to allow for demos with large amounts of data. An extension to this schema provides support for advanced analytic processing. The HR schema consists of seven tables: regions, countries, locations, departments, employees, jobs, and job_histories. This book only implements HR schema, since the other schemas will be implemented in the next books.




The Best Way to Learn Java GUI with MySQL, MariaDB, and PostgreSQL


Book Description

In this book, you will create three Java GUI applications using MySQL, MariaDB, and PostgreSQL. In this book, you will learn how to build from scratch a database management system using Java. In designing a GUI and as an IDE, you will make use of the NetBeans tool. Gradually and step by step, you will be taught how to utilize three different databases in Java. In chapter one, you will create School database and its six tables. In chapter two, you will study: Creating the initial three table projects in the school database: Teacher table, TClass table, and Subject table; Creating database configuration files; Creating a Java GUI for viewing and navigating the contents of each table; Creating a Java GUI for inserting and editing tables; and Creating a Java GUI to join and query the three tables. In chapter three, you will learn: Creating the main form to connect all forms; Creating a project will add three more tables to the school database: the Student table, the Parent table, and Tuition table; Creating a Java GUI to view and navigate the contents of each table; Creating a Java GUI for editing, inserting, and deleting records in each table; Creating a Java GUI to join and query the three tables and all six. In chapter four, you will study how to query the six tables. In chapter five, you will learn the basics of cryptography using Java. Here, you will learn how to write a Java program to count Hash, MAC (Message Authentication Code), store keys in a KeyStore, generate PrivateKey and PublicKey, encrypt / decrypt data, and generate and verify digital prints. In chapter six, you will create Bank database and its tables. In chapter seven, you will learn how to create and store salt passwords and verify them. You will create a Login table. In this case, you will see how to create a Java GUI using NetBeans to implement it. In addition to the Login table, in this chapter you will also create a Client table. In the case of the Client table, you will learn how to generate and save public and private keys into a database. You will also learn how to encrypt / decrypt data and save the results into a database. In chapter eight, you will create an Account table. This account table has the following ten fields: account_id (primary key), client_id (primarykey), account_number, account_date, account_type, plain_balance, cipher_balance, decipher_balance, digital_signature, and signature_verification. In this case, you will learn how to implement generating and verifying digital prints and storing the results into a database. In chapter nine, you will create a Client_Data table, which has the following seven fields: client_data_id (primary key), account_id (primary_key), birth_date, address, mother_name, telephone, and photo_path. In chapter ten, you will be taught how to create Crime database and its tables. In chapter eleven, you will be taught how to extract image features, utilizing BufferedImage class, in Java GUI. In chapter twelve, you will be taught to create Java GUI to view, edit, insert, and delete Suspect table data. This table has eleven columns: suspect_id (primary key), suspect_name, birth_date, case_date, report_date, suspect_ status, arrest_date, mother_name, address, telephone, and photo. In chapter thirteen, you will be taught to create Java GUI to view, edit, insert, and delete Feature_Extraction table data. This table has eight columns: feature_id (primary key), suspect_id (foreign key), feature1, feature2, feature3, feature4, feature5, and feature6. In chapter fourteen, you will add two tables: Police_Station and Investigator. These two tables will later be joined to Suspect table through another table, File_Case. The Police_Station has six columns: police_station_id (primary key), location, city, province, telephone, and photo. The Investigator has eight columns: investigator_id (primary key), investigator_name, rank, birth_date, gender, address, telephone, and photo. Here, you will design a Java GUI to display, edit, fill, and delete data in both tables. In chapter fifteen, you will add two tables: Victim and File_Case. The File_Case table will connect four other tables: Suspect, Police_Station, Investigator and Victim. The Victim table has nine columns: victim_id (primary key), victim_name, crime_type, birth_date, crime_date, gender, address, telephone, and photo. The File_Case has seven columns: file_case_id (primary key), suspect_id (foreign key), police_station_id (foreign key), investigator_id (foreign key), victim_id (foreign key), status, and description. Here, you will also design a Java GUI to display, edit, fill, and delete data in both tables.




DIGITAL SIGNATURE ALGORITHM: LEARN BY EXAMPLES WITH PYTHON AND TKINTER


Book Description

Project 1 demonstrates generating a DSA (Digital Signature Algorithm) key pair using the cryptography library, where a 2048-bit private key is created and a corresponding public key is derived. The private key is essential for securely signing digital messages, and the public key allows others to verify these signatures. Both keys are serialized into PEM format, making them suitable for storage or transmission. The private key is serialized without encryption (though encryption is optional), while the public key is also serialized for easy sharing and use in cryptographic operations. Project 2 is a DSA (Digital Signature Algorithm) Key Generator application built with Python's tkinter for the GUI and the cryptography library for key generation. It provides an intuitive interface to generate, view, and save 2048-bit DSA key pairs, essential for secure digital signatures. The GUI features two tabs: "Generate Keys" for creating and serializing keys into PEM format, and "View Keys" for displaying them. Users can save the keys as .pem files with ease, supported by robust error handling and success notifications, making the application accessible and practical for secure communication needs. Project 3 demonstrates the process of signing and verifying a message using the Digital Signature Algorithm (DSA) in Python, while ensuring the signature is UTF-8 safe by encoding it in Base64. It begins by generating a DSA private and public key pair with a key size of 2048 bits. A message (in bytes) is then created, which is the data to be signed. The private key is used to generate a digital signature for the message using the SHA-256 hashing algorithm, ensuring the integrity and authenticity of the message. The generated signature, which is binary data, is encoded into Base64 format to make it text-safe and suitable for UTF-8 encoding. To verify the signature, the Base64-encoded signature is first decoded back into its original binary form. The public key is then used to verify the authenticity of the signature by comparing it to the message. If the verification is successful, the message "Signature is valid." is printed; otherwise, an InvalidSignature exception is raised, and the message "Signature is invalid." is displayed. This approach ensures that the digital signature can be safely transmitted or stored as text without data corruption, while still preserving its security properties. Project 4 is a Tkinter-based GUI application for Digital Signature Algorithm (DSA) operations, offering an intuitive interface for generating DSA keys, signing messages, and verifying signatures. It has two main tabs: one for generating and displaying DSA key pairs in PEM format, and another for signing and verifying messages. Users can input a message, sign it with the private key, and view the Base64-encoded signature, or verify a signature against the original message using the public key. The application handles errors gracefully, providing feedback on operations, making it a practical tool for cryptographic tasks. Project 5 and 6 provides a complete implementation for generating, signing, and verifying files using the Digital Signature Algorithm (DSA). It includes functions for creating DSA key pairs, signing file contents, and verifying signatures. The generate_and_save_keys() function generates a private and public key, serializes them to PEM format, and saves them to files. The sign_file() function uses the private key to sign the SHA-256 hash of a file's content, saving the signature in Base64 format. The verify_file_signature() function then verifies this signature using the public key, ensuring the file's authenticity and integrity. The project is designed as a user-friendly Tkinter-based GUI application, with three main functionalities: key generation, file signing, and signature verification. Users can generate DSA key pairs in the "Generate Keys" tab, sign files in the "Sign File" tab, and verify signatures in the "Verify Signature" tab. By providing an intuitive interface, this application enables users to efficiently manage cryptographic operations, ensuring data security and authenticity without needing to understand low-level cryptographic details. Project 7 and 8 focuses on creating and securing synthetic financial datasets to ensure data integrity. It combines data generation, digital signing, and signature verification to authenticate and protect financial records. The primary goals are to generate realistic financial data, secure it with digital signatures, and verify these signatures to detect tampering or corruption. The project involves generating a synthetic dataset with multiple columns such as transaction IDs, account numbers, amounts, currencies, timestamps, and transaction types. DSA keys are then generated for signing and verification, with the private key used for signing each entry in the dataset. These signatures are saved separately, allowing verification using the public key. This process ensures that any unauthorized changes to the data are detected, demonstrating a secure approach to data handling in financial applications. Project 9 and 10 combines the Digital Signature Algorithm (DSA) with Least Significant Bit (LSB) steganography to securely hide a signed message within an image. First, DSA keys are generated and used to sign a message, ensuring its authenticity and integrity. The signed message is then embedded into an image using LSB steganography, where the least significant bits of the image pixels' red channel are altered to include the binary representation of the message and its signature. To extract and verify the hidden data, the code retrieves the embedded bits from the image and reconstructs the original message. It then uses the public DSA key to verify the signature, confirming the message's authenticity. This integration of cryptographic signing with steganography provides a secure method to conceal and authenticate sensitive information within an image file. Project 11 and 12 provides a workflow for encrypting and hiding data using RSA and DSA cryptographic algorithms, along with steganography. It begins with generating RSA and DSA keys, then encrypts a message using RSA and signs it with a DSA private key, ensuring confidentiality and authenticity. The encrypted and signed data is embedded into an image using Least Significant Bit (LSB) steganography, altering the pixel values to include the hidden information. The process continues by extracting the hidden data from the image, verifying its integrity using the DSA signature, and decrypting the message with the RSA private key. This approach demonstrates a secure method of combining encryption, digital signatures, and steganography to protect and authenticate sensitive data, making it a robust solution for secure data transmission.




TKINTER, DATA SCIENCE, AND MACHINE LEARNING


Book Description

In this project, we embarked on a comprehensive journey through the world of machine learning and model evaluation. Our primary goal was to develop a Tkinter GUI and assess various machine learning models on a given dataset to identify the best-performing one. This process is essential in solving real-world problems, as it helps us select the most suitable algorithm for a specific task. By crafting this Tkinter-powered GUI, we provided an accessible and user-friendly interface for users engaging with machine learning models. It simplified intricate processes, allowing users to load data, select models, initiate training, and visualize results without necessitating code expertise or command-line operations. This GUI introduced a higher degree of usability and accessibility to the machine learning workflow, accommodating users with diverse levels of technical proficiency. We began by loading and preprocessing the dataset, a fundamental step in any machine learning project. Proper data preprocessing involves tasks such as handling missing values, encoding categorical features, and scaling numerical attributes. These operations ensure that the data is in a format suitable for training and testing machine learning models. Once our data was ready, we moved on to the model selection phase. We evaluated multiple machine learning algorithms, each with its strengths and weaknesses. The models we explored included Logistic Regression, Random Forest, K-Nearest Neighbors (KNN), Decision Trees, Gradient Boosting, Extreme Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), and Support Vector Classifier (SVC). For each model, we employed a systematic approach to find the best hyperparameters using grid search with cross-validation. This technique allowed us to explore different combinations of hyperparameters and select the configuration that yielded the highest accuracy on the training data. These hyperparameters included settings like the number of estimators, learning rate, and kernel function, depending on the specific model. After obtaining the best hyperparameters for each model, we trained them on our preprocessed dataset. This training process involved using the training data to teach the model to make predictions on new, unseen examples. Once trained, the models were ready for evaluation. We assessed the performance of each model using a set of well-established evaluation metrics. These metrics included accuracy, precision, recall, and F1-score. Accuracy measured the overall correctness of predictions, while precision quantified the proportion of true positive predictions out of all positive predictions. Recall, on the other hand, represented the proportion of true positive predictions out of all actual positives, highlighting a model's ability to identify positive cases. The F1-score combined precision and recall into a single metric, helping us gauge the overall balance between these two aspects. To visualize the model's performance, we created key graphical representations. These included confusion matrices, which showed the number of true positive, true negative, false positive, and false negative predictions, aiding in understanding the model's classification results. Additionally, we generated Receiver Operating Characteristic (ROC) curves and area under the curve (AUC) scores, which depicted a model's ability to distinguish between classes. High AUC values indicated excellent model performance. Furthermore, we constructed true values versus predicted values diagrams to provide insights into how well our models aligned with the actual data distribution. Learning curves were also generated to observe a model's performance as a function of training data size, helping us assess whether the model was overfitting or underfitting. Lastly, we presented the results in a clear and organized manner, saving them to Excel files for easy reference. This allowed us to compare the performance of different models and make an informed choice about which one to select for our specific task. In summary, this project was a comprehensive exploration of the machine learning model development and evaluation process. We prepared the data, selected and fine-tuned various models, assessed their performance using multiple metrics and visualizations, and ultimately arrived at a well-informed decision about the most suitable model for our dataset. This approach serves as a valuable blueprint for tackling real-world machine learning challenges effectively.




HIGHER EDUCATION STUDENT ACADEMIC PERFORMANCE ANALYSIS AND PREDICTION USING MACHINE LEARNING WITH PYTHON GUI


Book Description

The dataset used in this project was collected from the Faculty of Engineering and Faculty of Educational Sciences students in 2019. The purpose is to predict students' end-of-term performances using ML techniques. Attribute information in the dataset are as follows: Student ID; Student Age (1: 18-21, 2: 22-25, 3: above 26); Sex (1: female, 2: male); Graduated high-school type: (1: private, 2: state, 3: other); Scholarship type: (1: None, 2: 25%, 3: 50%, 4: 75%, 5: Full); Additional work: (1: Yes, 2: No); Regular artistic or sports activity: (1: Yes, 2: No); Do you have a partner: (1: Yes, 2: No); Total salary if available (1: USD 135-200, 2: USD 201-270, 3: USD 271-340, 4: USD 341-410, 5: above 410); Transportation to the university: (1: Bus, 2: Private car/taxi, 3: bicycle, 4: Other); Accommodation type in Cyprus: (1: rental, 2: dormitory, 3: with family, 4: Other); Mother's education: (1: primary school, 2: secondary school, 3: high school, 4: university, 5: MSc., 6: Ph.D.); Father's education: (1: primary school, 2: secondary school, 3: high school, 4: university, 5: MSc., 6: Ph.D.); Number of sisters/brothers (if available): (1: 1, 2:, 2, 3: 3, 4: 4, 5: 5 or above); Parental status: (1: married, 2: divorced, 3: died - one of them or both); Mother's occupation: (1: retired, 2: housewife, 3: government officer, 4: private sector employee, 5: self-employment, 6: other); Father's occupation: (1: retired, 2: government officer, 3: private sector employee, 4: self-employment, 5: other); Weekly study hours: (1: None, 2: <5 hours, 3: 6-10 hours, 4: 11-20 hours, 5: more than 20 hours); Reading frequency (non-scientific books/journals): (1: None, 2: Sometimes, 3: Often); Reading frequency (scientific books/journals): (1: None, 2: Sometimes, 3: Often); Attendance to the seminars/conferences related to the department: (1: Yes, 2: No); Impact of your projects/activities on your success: (1: positive, 2: negative, 3: neutral); Attendance to classes (1: always, 2: sometimes, 3: never); Preparation to midterm exams 1: (1: alone, 2: with friends, 3: not applicable); Preparation to midterm exams 2: (1: closest date to the exam, 2: regularly during the semester, 3: never); Taking notes in classes: (1: never, 2: sometimes, 3: always); Listening in classes: (1: never, 2: sometimes, 3: always); Discussion improves my interest and success in the course: (1: never, 2: sometimes, 3: always); Flip-classroom: (1: not useful, 2: useful, 3: not applicable); Cumulative grade point average in the last semester (/4.00): (1: <2.00, 2: 2.00-2.49, 3: 2.50-2.99, 4: 3.00-3.49, 5: above 3.49); Expected Cumulative grade point average in the graduation (/4.00): (1: <2.00, 2: 2.00-2.49, 3: 2.50-2.99, 4: 3.00-3.49, 5: above 3.49); Course ID; and OUTPUT: Grade (0: Fail, 1: DD, 2: DC, 3: CC, 4: CB, 5: BB, 6: BA, 7: AA). The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, and XGB classifier. Three feature scaling used in machine learning are raw, minmax scaler, and standard scaler. Finally, you will develop a GUI using PyQt5 to plot cross validation score, predicted values versus true values, confusion matrix, learning curve, decision boundaries, performance of the model, scalability of the model, training loss, and training accuracy.