Below are some of the key projects I've been working on:
This FastAPI application offers text prediction using a pre-trained RoBERTa model. It serves a static HTML page and provides an endpoint for generating text predictions in real time.
The Fast + Focused App is a web-based application designed to increase your reading speed by displaying words from a text file at a rapid pace. This method, often referred to as speed reading, lets users consume content faster by minimizing eye movement and keeping the text in central vision. The app reads a text file and displays each word individually at a user-defined speed.
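The word-by-word timing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the app's actual code: the function names (`word_delay_seconds`, `play`) and the words-per-minute parameter are assumptions made for this example.

```python
import time
from typing import Callable, Iterator


def word_delay_seconds(words_per_minute: int) -> float:
    """Return how long each word stays on screen for a given reading speed."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return 60.0 / words_per_minute


def iter_words(text: str) -> Iterator[str]:
    """Yield whitespace-separated words one at a time."""
    yield from text.split()


def play(
    text: str,
    words_per_minute: int,
    show: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Show each word, then wait for the per-word delay (hypothetical helper)."""
    delay = word_delay_seconds(words_per_minute)
    for word in iter_words(text):
        show(word)
        sleep(delay)
```

Passing `show` and `sleep` as parameters keeps the loop testable without real delays; a browser version would likely use a timer instead of blocking sleep.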
Tasnif is a Python package designed for clustering images into user-defined classes based on their visual content. It utilizes deep learning to generate image embeddings, Principal Component Analysis (PCA) for dimensionality reduction, and K-means for clustering. Tasnif supports processing on both GPU and CPU, making it versatile for different computational environments.
from tasnif import Tasnif
# Initialize Tasnif with 5 classes, PCA dimensions set to 16, and GPU usage disabled (runs on CPU)
classifier = Tasnif(num_classes=5, pca_dim=16, use_gpu=False)
# Read images from a specified directory
classifier.read('path/to/your/images')
# Calculate embeddings, PCA, and perform clustering
classifier.calculate()
# Export clustered images and grids
classifier.export('path/to/output')
This is a simple network scanner written in Python. It utilizes ARP requests to discover devices on a given network and performs a lightweight scan on each discovered device using Nmap.
A Python script that summarizes webpages from specified URLs using the LangChain framework and the ChatOllama model. It uses a language model to generate detailed summaries, which helps you quickly understand the content of web-based documents.
python summarizer.py -u "http://example.com/document"
docker build -t web_summarizer .
docker run -p 7860:7860 web_summarizer
# Use this variant if Ollama is running on the host machine
docker run -d --network='host' -p 7860:7860 web_summarizer
Easy image clustering tool.
usage: main.py [-h] [-i INPUT] [-c CLUSTER] [-p PCA]
Image caption CLI
optional arguments:
-h, --help show this help message and exit
-i INPUT, --input INPUT Input directory path, such as ./images
-c CLUSTER, --cluster CLUSTER How many cluster will be
-p PCA, --pca PCA PCA Dimensions
--cpu Run on CPU
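For readers who want to see how such a command line is typically declared, here is a minimal `argparse` sketch that would produce a similar help message. The flag names and help strings are taken from the usage output above; the integer types for `--cluster` and `--pca` and the `build_parser` helper are assumptions, not the tool's confirmed implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags in the usage text (a sketch)."""
    parser = argparse.ArgumentParser(prog="main.py", description="Image caption CLI")
    parser.add_argument("-i", "--input", help="Input directory path, such as ./images")
    # Integer types below are assumed; the help output does not state them.
    parser.add_argument("-c", "--cluster", type=int, help="How many cluster will be")
    parser.add_argument("-p", "--pca", type=int, help="PCA Dimensions")
    parser.add_argument("--cpu", action="store_true", help="Run on CPU")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```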
Contact Sheet Generator is a Python script that generates a contact sheet from a directory of images. It uses the PIL library to process images and multiprocessing to generate thumbnails in parallel. The contact sheet is created by arranging the thumbnails in a grid pattern.
python contract_sheet.py /path/to/images --shuffle --heic_to jpeg --img-size 500 --no-crop result.jpg
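The grid arrangement mentioned above comes down to simple arithmetic. The sketch below computes the sheet size and the top-left paste position of each thumbnail; the function name `grid_layout`, the square thumbnails, and the default of a roughly square grid are assumptions for illustration and may differ from what the script does.

```python
import math
from typing import List, Optional, Tuple


def grid_layout(
    n_images: int, thumb_size: int, columns: Optional[int] = None
) -> Tuple[Tuple[int, int], List[Tuple[int, int]]]:
    """Return ((sheet_width, sheet_height), [(x, y), ...]) for square thumbnails."""
    if n_images <= 0 or thumb_size <= 0:
        raise ValueError("n_images and thumb_size must be positive")
    if columns is None:
        # Assumed default: a roughly square grid.
        columns = math.ceil(math.sqrt(n_images))
    rows = math.ceil(n_images / columns)
    # Fill the grid row by row, left to right.
    positions = [
        ((i % columns) * thumb_size, (i // columns) * thumb_size)
        for i in range(n_images)
    ]
    return (columns * thumb_size, rows * thumb_size), positions
```

With PIL, each position could then be passed to `Image.paste` on a blank sheet of the computed size.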
A simple Python script for extracting audio embeddings.
This project provides a simple image similarity calculator using the CLIP (Contrastive Language-Image Pre-training) model. It consists of two Python scripts, predictor.py and app.py, that allow you to calculate the cosine similarity between two images.
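As a reference for the metric itself, cosine similarity between two embedding vectors can be written in plain Python as below. This is a sketch of the formula only: it assumes the embeddings are already available as lists of floats, and it does not reproduce how predictor.py obtains them from CLIP.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return dot(a, b) / (|a| * |b|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Values close to 1 mean the two embeddings point in nearly the same direction, while values near 0 mean they are close to orthogonal.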
The Audio Genre Detection project is a system for determining the genre of audio files. It uses Essentia, a library for audio analysis, together with TensorFlow to classify genres. It includes a Dockerized environment that simplifies running the audio genre detection system.
Captioning is an img2txt tool that uses the BLIP model to generate and export captions for images.
This project focuses on building and utilizing an autoencoder for clustering unlabeled image datasets. The autoencoder is designed to compress images into a lower-dimensional representation and then reconstruct them from this compressed form. The project includes training the autoencoder, extracting features from any image test dataset, and visualizing the embeddings using t-SNE to further reduce dimensionality for visualization.
This node enhances the KSamplerAdvanced node by adding tiling functionality. The key changes are encapsulated in two new classes: KSamplerAdvancedTile and CircularVAEDecode.
KSamplerAdvancedTile: This class handles tiling along the X and Y axes independently. It includes methods for setting layer padding based on tiling parameters, applying asymmetric tiling to all convolutional layers, hijacking and restoring Conv2d methods for customized forward passes, and a tailored sampling method that accounts for tiling preferences, noise addition, and denoise levels.
CircularVAEDecode: A class that extends VAEDecode, introducing circular padding for Conv2d layers during the decoding process.
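To make the idea of per-axis tiling concrete, the following sketch pads a small 2D grid with wrap-around (circular) values on one axis and a constant fill on the other. It is a pure-Python illustration of the padding concept only; the function name `pad_2d` and its parameters are hypothetical, and the node itself applies the padding inside Conv2d layers rather than on Python lists.

```python
from typing import List, Optional


def _source_index(i: int, size: int, tile: bool) -> Optional[int]:
    """Map a padded coordinate to a source index, or None if it falls outside."""
    if tile:
        return i % size  # wrap around (circular padding)
    return i if 0 <= i < size else None


def pad_2d(
    grid: List[List[int]], pad: int, tile_x: bool, tile_y: bool, fill: int = 0
) -> List[List[int]]:
    """Pad a 2D grid by `pad` cells per side, wrapping only on the tiled axes."""
    height, width = len(grid), len(grid[0])
    out = []
    for y in range(-pad, height + pad):
        sy = _source_index(y, height, tile_y)
        row = []
        for x in range(-pad, width + pad):
            sx = _source_index(x, width, tile_x)
            # Cells outside a non-tiled axis get the constant fill value.
            row.append(fill if sx is None or sy is None else grid[sy][sx])
        out.append(row)
    return out
```

Because the borders on a tiled axis are copied from the opposite edge, a convolution over the padded grid sees the image as if it repeated along that axis, which is what makes the output seamless there.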