Projects and Publications

My research spans NLP, machine learning, and privacy, with problems ranging from video summarization to relaxations of differential privacy. I am developing a multi-source academic summarizer that uses NLP to analyze videos and text and to generate concise summaries and discussion prompts. My work on relaxing differential privacy explored how to improve privacy guarantees while maintaining model performance, and I developed a novel NLP2SQL approach for efficient database interaction. Alongside this academic focus, I worked on cancer grading with deep learning and network anomaly detection with optimized ANNs, both published by Springer.

Publications

Prostate Cancer Grading using Multistage Deep Neural Networks

2021

MIND 2021 • Springer publication

Developed a novel multi-stage deep learning framework for automated Gleason system grading (GSG) and grade group (GG) classification of prostate cancer cells. Unlike existing methods, this approach handles Gleason pattern (GP) identification by combining classification with segmentation.
Each CNN in the system achieved an F1-score above 90%, demonstrating its effectiveness in GSG and GG classification. The paper also reports precision and recall above 90% for each GP, exceeding the state-of-the-art techniques.
This DL-based system holds promise for enhancing objectivity and efficiency in prostate cancer diagnosis, potentially leading to improved patient outcomes.
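
The relationship between Gleason patterns, the Gleason score, and the grade group that such a pipeline outputs can be sketched as below. This follows the standard ISUP grade group definitions and is not code from the paper; the function name `grade_group` is illustrative, and restricting patterns to 3, 4, and 5 is an assumption of this sketch.

```python
def grade_group(primary: int, secondary: int) -> int:
    """Map primary and secondary Gleason patterns to an ISUP grade group (1-5)."""
    # Assumption of this sketch: only patterns 3, 4, and 5 are reported.
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("Gleason patterns are expected to be 3, 4, or 5")
    score = primary + secondary  # Gleason score
    if score == 6:
        return 1
    if score == 7:
        # 3+4 and 4+3 share a score but fall into different grade groups.
        return 2 if primary == 3 else 3
    if score == 8:
        return 4
    return 5  # scores 9 and 10
```

The split at score 7 shows why the dominant (primary) pattern has to be identified separately rather than predicting the score alone.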

Network Anomaly Detection using Artificial Neural Networks Optimised with PSO-DE Hybrid

2018

SSCC 2018 • Springer publication

Proposed a novel hybrid PSO-DE algorithm that combines Particle Swarm Optimization and Differential Evolution to optimize ANNs for network anomaly detection. The hybrid PSO-DE algorithm leverages the complementary advantages of both techniques to achieve better exploration and exploitation capabilities.
The optimized ANNs reached an accuracy of 98.7%, a significant improvement over traditional ANN-based methods.
The evaluation used NSL-KDD, a standard benchmark dataset for network intrusion detection. The results indicate that the proposed approach can effectively detect anomalies in network traffic.
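
One common way to hybridize the two optimizers is sketched below: a PSO velocity update followed by a DE (rand/1/bin) step on the particles' personal bests. This is an illustrative assumption, not necessarily the scheme used in the paper. The `sphere` objective stands in for an ANN's training loss, and all names and hyperparameter values are hypothetical.

```python
import random

def sphere(x):
    """Toy objective standing in for an ANN's training loss."""
    return sum(v * v for v in x)

def pso_de(objective, dim=5, swarm=20, iters=200, seed=0,
           w=0.7, c1=1.5, c2=1.5, f=0.5, cr=0.9):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]

    for _ in range(iters):
        # PSO step: pull each particle toward its personal and the global best.
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val

        # DE step (rand/1/bin) on the personal bests, with greedy selection.
        for i in range(swarm):
            a, b, c = rng.sample([j for j in range(swarm) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one coordinate is mutated
            trial = [
                pbest[a][d] + f * (pbest[b][d] - pbest[c][d])
                if (rng.random() < cr or d == j_rand) else pbest[i][d]
                for d in range(dim)
            ]
            val = objective(trial)
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = trial, val

        # Refresh the global best after both steps.
        g = min(range(swarm), key=pbest_val.__getitem__)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
        history.append(gbest_val)

    return gbest, gbest_val, history
```

To optimize an ANN with this sketch, `objective` would take the flattened weight vector and return the network's loss on the training data.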

Analysis of Neuro-Symbolic AI for Cognitive, Linguistic, and Philosophical Applications

2024

Conducted a comparative study of Neuro-Symbolic Knowledge Distillation, Neuro-Symbolic Commonsense Reasoning, and Cognitive Architecture, identifying their common ground and how each can address the others' shortcomings.
Proposed a novel architecture combining concepts from these three research fields for Situated Reasoning about norms, intents, and actions. The goal is to use Neuro-Symbolic AI to enhance the linguistic abilities of current LLMs.

Whispers of the Heart - Sentiment Analysis of Journal for Therapeutic Assistance

2024

Developed Whispers of the Heart, a daily journaling app that encourages people to record their thoughts and emotions and to share art forms (quotes, poetry, paintings, or songs) that they relate to.
Fine-tuned the Gemini 1.5 Pro multimodal model to analyze the sentiment of journal entries. The analysis is then shared with the therapist as a daily insight into the person’s mind. The journal itself stays private from the therapist, who still receives insightful information on the person’s emotions.
Integrated explainability into the model so that it articulates its reasoning, breaking down its interpretation step by step.
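
One way to keep the journal private while still sharing insight is to pass the therapist only derived labels and never the entry text. The sketch below is a hypothetical data layout, not the app's actual code: `EntryAnalysis`, `TherapistReport`, and `to_therapist_report` are illustrative names, and leaving the model's rationale out of the report is an assumption of this sketch.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class EntryAnalysis:
    """Hypothetical model output for one journal entry."""
    entry_text: str   # private: stays on the user's side
    sentiment: str    # e.g. "positive", "negative", "mixed"
    emotions: dict    # emotion label -> score in [0, 1]
    rationale: list   # step-by-step explanation produced by the model

@dataclass(frozen=True)
class TherapistReport:
    """What the therapist sees: labels only, no journal text."""
    sentiment: str
    top_emotions: list

def to_therapist_report(analysis, k=3):
    # Rank emotions by score and keep only the k strongest labels.
    ranked = sorted(analysis.emotions.items(), key=lambda kv: kv[1], reverse=True)
    return TherapistReport(
        sentiment=analysis.sentiment,
        top_emotions=[label for label, _ in ranked[:k]],
    )
```

Because `TherapistReport` has no text field, serializing it with `asdict` cannot leak the entry itself.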

Academic Video Summarizer and Enhancer using Retrieval Augmented Generation with LLMs

2024

Developed a novel LLM-based pipeline for generating informative video summaries. This approach leverages Retrieval Augmented Generation (RAG) to incorporate external knowledge from reading materials, resulting in summaries that are more comprehensive and student-friendly than traditional methods.
Implemented few-shot learning and prompt-based fine-tuning to enhance the performance of pre-trained LLMs. This strategy addresses the limitations of pre-trained models and significantly improves the quality and accuracy of the generated video summaries.
Conducted a user survey in which the proposed solution outperformed existing video summarization techniques, supporting the approach's effectiveness for educational summaries.
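
The retrieval and few-shot prompting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's pipeline: a bag-of-words cosine similarity stands in for whatever retriever was actually used, the prompt wording is invented, and `retrieve` and `build_prompt` are hypothetical names. The resulting prompt would then be sent to an LLM, which is omitted here.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=2):
    """Return up to k reading-material chunks most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    ranked = sorted(scored, key=lambda s: s[0], reverse=True)
    return [c for score, c in ranked[:k] if score > 0]

def build_prompt(transcript, chunks, examples, k=2):
    """Assemble instructions, few-shot examples, retrieved context, and the transcript."""
    context = retrieve(transcript, chunks, k)
    parts = ["Summarize the lecture transcript for students, "
             "using the reading excerpts as background."]
    for ex_transcript, ex_summary in examples:  # few-shot demonstrations
        parts.append(f"Transcript: {ex_transcript}\nSummary: {ex_summary}")
    parts.append("Reading excerpts:\n" + "\n".join(f"- {c}" for c in context))
    parts.append(f"Transcript: {transcript}\nSummary:")
    return "\n\n".join(parts)
```

Ending the prompt with an open `Summary:` slot mirrors the few-shot examples, so the model continues the same pattern.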

Relaxation of Differential Privacy in Machine Learning

2020

Analyzed the importance of Differential Privacy (DP) in Machine Learning, focusing on the convex bounds for privacy and accuracy in the models.
Conducted a comprehensive literature survey and investigated the convex properties of relaxed privacy guarantees, with a particular focus on Rényi DP. This research lays the groundwork for applying these methods to machine learning model security and privacy.
Proposed a Rényi-DP-inspired Gradient Descent algorithm for Neural Networks, demonstrating its potential to balance individual privacy with strong model performance.
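
For context, the sketch below shows the textbook ingredients that a Rényi-DP-based gradient descent typically builds on. It is not the proposed algorithm. It shows a clipped, Gaussian-noised gradient and the standard accounting: a Gaussian mechanism with sensitivity 1 and noise scale sigma satisfies (alpha, alpha / (2 sigma^2))-RDP, RDP adds up across steps, and (alpha, eps)-RDP implies (eps + ln(1/delta) / (alpha - 1), delta)-DP. Function names are illustrative, and the sketch assumes full-batch steps with no subsampling amplification.

```python
import math
import random

def clip(vec, max_norm):
    """Scale vec down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]

def noisy_gradient(per_example_grads, clip_norm, sigma, rng):
    """Clip each example's gradient, sum, add Gaussian noise, and average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for g in per_example_grads:
        for d, v in enumerate(clip(g, clip_norm)):
            total[d] += v
    n = len(per_example_grads)
    # Noise standard deviation is sigma times the clipping norm (the sensitivity).
    return [(t + rng.gauss(0.0, sigma * clip_norm)) / n for t in total]

def rdp_gaussian(alpha, sigma, steps):
    """RDP epsilon at order alpha after `steps` Gaussian mechanisms (no subsampling)."""
    return steps * alpha / (2.0 * sigma ** 2)

def rdp_to_dp(sigma, steps, delta, alphas=range(2, 129)):
    """Smallest (epsilon, delta)-DP epsilon over the candidate orders."""
    return min(
        rdp_gaussian(a, sigma, steps) + math.log(1.0 / delta) / (a - 1)
        for a in alphas
    )
```

Searching over several orders matters because small alpha keeps the RDP term low while large alpha shrinks the ln(1/delta) penalty, so the best order depends on sigma and the number of steps.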

Scalable Sentiment Analysis for Real-Time Product Feedback (using Seq2Seq + LSTM)

2019

Implemented a Seq2Seq model with LSTMs to analyze product-based tweets, enabling real-time sentiment analysis for multiple businesses at Sprinklr. This approach captured contextual nuances in tweets, leading to more accurate sentiment identification than traditional methods.
Leveraged sentiment analysis to generate actionable product feedback for Sprinklr clients. Businesses gained valuable insights to improve customer experience and product satisfaction by analyzing tweet sentiment towards specific products.
Pioneered the use of model compression techniques (teacher-student distillation and pruning) to create lightweight NLP models. This enabled the deployment of sentiment analysis models on mobile platforms, extending the reach and accessibility for Sprinklr's clients.
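
The two compression techniques named above can be illustrated with a minimal sketch. It shows the commonly used formulations, a temperature-softened KL distillation loss and magnitude-based pruning, and is not the production code. Function names and the default temperature are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

In practice the distillation term is usually mixed with the ordinary cross-entropy on the true labels, and pruning is followed by a short fine-tuning pass to recover accuracy.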