About Me
Hello! I am a Senior Research Scientist on the Speech-NLP team at Reliance Jio AICoE.
I work on projects involving audio and text data across the sports, telecom, healthcare, and legal domains.
I have a PhD from the Department of CSE at IIT Guwahati. My doctoral research analyzed the performance of cricket players by mining live ball-by-ball text commentary to identify their strengths and weaknesses.
I have presented my research at conferences including ICMLA 2019, ESANN 2020, MLSA 2020, CMSAC 2020, IJCNN 2022, and INTERSPEECH 2023.
I have an M.Tech degree in CSE from IIT Guwahati, where I worked on a convex optimization problem: 'spectral clustering using convex and constrained settings'.
Active Research Areas: Audio Question Answering, Automated Audio Captioning, Language-Based Audio Retrieval, Adversarial Attacks on Deep Models, Large Language Models, Legal AI.
News & Updates
- June 2023: Awesome Audio Question Answering [Link]
- Nov 2023: Awesome Audio Visual Question Answering [Link]
Areas of Expertise
- NLP: Conversational AI, Document Summarization, Sentiment Analysis, Machine Translation, Question Answering, Named Entity Recognition, Semantic Role Labeling, Cross-Lingual Understanding, Low-Resource Language Processing, Adversarial Attacks and Defenses, Generative AI, and Large Language Models (LLMs).
- Audio: Multimodal Audio Analysis, Sound Event Detection and Classification, Automated Audio Captioning, Language-Based Audio Retrieval, Audio Question Answering, Adversarial Attacks and Defenses in Audio, Generative Audio Models, Music Information Retrieval, and Environmental Sound Analysis.
- Computer Vision: Image Captioning, Visual Question Answering, Audio Visual Question Answering, Transfer Learning and Pretrained Models, and Cross-Modal Integration.
Experience
Senior Research Scientist, Reliance Jio AICoE, Hyderabad, India (Sep 2021 - Present)
- Call Audit Automation
- Details: An automated system that extracts speech- and text-based analytics from the daily influx of calls received at the call center.
- Contributions: Established a pipeline for collecting, processing, normalizing, and augmenting call center text data. Trained probabilistic language models for ASR systems on the generated text. Performed sentiment analysis with reasoning on call transcripts to identify customer and agent sentiments, and the reasons behind them, both during and after calls, with the aim of improving the overall customer experience.
- RF Hospital ASR
- Details: A tool created to assist doctors in converting spoken patient information into written notes.
- Contributions: Set up a dedicated pipeline for collecting, processing, normalizing, and augmenting clinical text data. Trained probabilistic language models on the generated text to enhance ASR systems.
- PDF Chatbot
- Details: An LLM-powered chatbot that processes and extracts information from PDF documents. Users can ask questions, give commands, or obtain feedback on their PDF files.
- Contributions: Implemented the project end to end, covering chatbot architecture design, NLP integration, and PDF parsing. The chatbot extracts, interprets, and presents information from PDF files, streamlining document retrieval for users working with large volumes of PDF data.
- Text to SQL
- Details: An LLM-powered system that converts natural-language user queries into SQL queries. It lets users express database queries in plain language, making databases more accessible to query.
- Contributions: Developed two LLM-based frameworks for Text to SQL conversion. Designed the architecture, integrated NLP techniques, and optimized the models for accurate SQL query generation.
- RASA Voice Bot
- Details: A voice bot capable of engaging in human-like dialogue, capturing context, and delivering intelligent responses.
- Contributions: Developed a voice assistant using the RASA framework. This involved building conversational models, managing context, and integrating natural language understanding for contextual responses.
Research Intern, Reliance Jio AICoE, Hyderabad, India (June 2021 - Aug 2021)
- Cricket Analytics for Mumbai Indians: Developed an end-to-end framework for cricket text commentary collection (from ESPNcricinfo and IPLT20 websites), processing, normalization, and augmentation for ASR.
Teaching Assistant, IIT Guwahati, India (July 2013 - Dec 2020)
- Courses: Software Engineering (Fall-2018, Fall-2019), Design and Analysis of Algorithms (Fall-2017), Computer Vision using Machine Learning (Fall-2016), Discrete Mathematics (Fall-2015), Probability and Linear Algebra (Spring-2014), Data Communication (Fall-2013).
- Labs: Database (Spring-2015, Spring-2016, Spring-2020, Fall-2020), Computing (Spring-2017, Spring-2018, Spring-2019), Data Structures (Fall-2014).
Publications
-
Cricket Player Profiling: Unraveling Strengths and Weaknesses Using Text Commentary Data
S. R. Behera and V. S. Vedula
arXiv
[paper]
[code]
-
AQA-LLM: A Scalable Automated AQA Data Generation Framework Using Large Language Model
S. R. Behera, K. M. Injeti, J. S. K. Patibandla, P. K. Pokala, A. M. Tripathi, P. B. Reddy, G. Duggal, and S. R. M. Prasanna
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
[paper]
[code]
[core rank = A*] (Submitted)
-
Towards Multi-Lingual Audio Question Answering
S. R. Behera, P. B. Reddy, A. M. Tripathi, B. R. Megavath, and T. Karavadi
Conference of the International Speech Communication Association (INTERSPEECH), 2023
[paper]
[code]
[core rank = A*]
-
Reverse Adversarial Attack To Enhance Environmental Sound Classification
A. M. Tripathi, S. R. Behera, and K. Paul
IEEE International Joint Conference on Neural Networks (IJCNN), 2022
[paper]
[code]
[core rank = A]
-
K-Defensive Bit Planes: Defense Against Adversarial Attacks
A. M. Tripathi, S. R. Behera, and K. Paul
IEEE International Joint Conference on Neural Networks (IJCNN), 2022
[paper]
[code]
[core rank = A]
-
Investigation of Performance of Visual Attention Mechanisms for Environmental Sound Classification: A Comparative Study
A. M. Tripathi, S. R. Behera, and K. Paul
IEEE International Joint Conference on Neural Networks (IJCNN), 2022
[paper]
[code]
[core rank = A]
-
Adv-IFD: Adversarial Attack Datasets for An Intelligent Fault Diagnosis
A. M. Tripathi, S. R. Behera, and K. Paul
IEEE International Joint Conference on Neural Networks (IJCNN), 2022
[paper]
[code]
[core rank = A]
-
Learning Player-specific Strategies Using Cricket Text Commentary
S. R. Behera
PhD Thesis, 2021
[phd thesis]
-
Mining Temporal Changes in Strengths and Weaknesses of Cricket Players Using Tensor Decomposition
S. R. Behera and V. S. Vedula
European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2020
[paper]
[code]
[core rank = B]
-
Learning Strength and Weakness Rules of Cricket Players using Association Rule Mining
S. R. Behera and V. S. Vedula
Machine Learning and Data Mining for Sports Analytics (MLSA), ECML-PKDD Workshop, 2021
[paper]
[code]
-
Performance Analysis of Batsman against Spin Bowling and Fast Bowling in Cricket
S. R. Behera
Ohio State Sports Analytics Association Conference (OSUSAAC), 2020
[paper]
[code]
*Best Research Award*
-
Stats Aren't Everything; Learning Strengths and Weaknesses of Cricket Players
S. R. Behera and V. S. Vedula
Machine Learning and Data Mining for Sports Analytics (MLSA), ECML-PKDD Workshop, 2020
[paper]
[code]
-
Video Data Do More. Tracking Data Do Much. Text Commentary Data Do Much More
S. R. Behera and V. S. Vedula
Carnegie Mellon Sports Analytics Conference (CMSAC), 2020
[paper]
[code]
-
Mining Strengths and Weaknesses of Cricket Players Using Short Text Commentary
S. R. Behera, P. Agrawal, A. Awekar, and V. S. Vedula
IEEE International Conference On Machine Learning And Applications (ICMLA), 2019
[paper]
[code]
[core rank = C]
Web Applications
Education
- PhD in Computer Science and Engineering, IIT Guwahati, India, July 2015 - Sept 2021
- Thesis: Learning Player-specific Strategies using Cricket Text Commentary.
- M.Tech in Computer Science and Engineering, IIT Guwahati, India, July 2013 - June 2015
- Thesis: Spectral Clustering Using Convex and Constrained Settings.
- B.Tech in Computer Science and Engineering, VSSUT, Burla, India, July 2008 - June 2012
- Thesis: A Novel Ontology Based Entity Relationship Model.
Programming Skills
- Languages: Python, R, C, MATLAB, SQL.
- Others: PyTorch, FastText, spaCy, Flair, AllenNLP, TextBlob, CoreNLP, Gensim, NLTK, Hugging Face, Fairseq, Pandas, NumPy, SciPy, Scikit-learn, Seaborn, Matplotlib, Plotly, R Shiny.
Miscellaneous
- Best Research Award: Ohio State Sports Analytics Association Conference (OSUSAAC), 2020, Columbus, USA.
- GATE 2013: All India Rank 696 (99.68 percentile).
- Program Committee Member: ECML-PKDD 2020.
- Reviewer: IEEE VIS 2020, IEEE VIS 2021, IEEE VIS 2022.
- Grants and Fellowships: MHRD Government of India Fellowship for M.Tech and PhD.
- Organizer: Advaya 2015, PG cultural festival at IIT Guwahati.
- Technical Officer: Student Gymkhana Council 2014-2015 at IIT Guwahati.
- Email: swarupranjanbehera@gmail.com
- Address: Hitech City, Hyderabad, India