Ankesh Kumar
With a foundation in data engineering and three years of experience in the field, I am pursuing a Master’s in Computer Engineering with a concentration in Machine Learning at Northeastern University. I am driven by using data to innovate, streamline processes, and uncover insights, and I am seeking internships in Data Engineering, Data Analytics, and Machine Learning where I can contribute practical solutions.
Email – kumar.anke@northeastern.edu
Contact – +1-339-777-2489
Git link – https://github.com/ankesh86
Blogs – https://www.knowledgehut.com/blog/author/ankesh-kumar

Work Experience
Guardian Life Insurance Company of America, Gurgaon, India – May 2022-Aug 2023 – Business Intelligence Developer
- Developed and configured custom feeds and reports for an Absence Management project, utilizing SQL, SSIS, and SSRS to meet client requirements over the reporting server.
- Engaged in the maintenance and troubleshooting of SSIS packages, serving as an SQL developer to ensure system integrity and performance.
- Supported the transaction layer of an ASP.NET application, effectively managing a hub-spoke model database on the Transaction server.
- Ensured seamless data integration flow from the user interface to the transaction server, and from the transaction server to the reporting server, maintaining data consistency and accuracy across layers.
- Deployed SSIS packages for efficient data transfer between the transaction layer and the reporting layer, optimizing data accessibility and report generation.
Tata Consultancy Services, Gurgaon, India – Nov 2020-May 2022 – Business Intelligence Developer
- Developed and analyzed ETL processes using SAS Studio and MSBI tools (SSIS, SSRS, PowerBI) for an airlines analytics project, streamlining data integration, building pipelines and reporting.
- Engineered data integration solutions with SSIS, SSRS, and SAS for enterprise and customer data warehousing, improving data management and analytics capabilities.
- Managed and maintained over 200 ETL flows across Azure Data Factory, SQL Server Agent, and SAS Viya, ensuring efficient data processing and reliability.
- Directed the successful upgrade of the Enterprise Data Warehouse, transitioning and decommissioning SSIS and SQL pipelines from the 2008 to the 2016 version, enhancing system performance and future readiness.
- Monitored Data Warehouses’ ETL flows, resolved repetitive issues by root cause analysis, and optimized server loads through strategic SQL job scheduling, ensuring high data integrity and system efficiency.
Skills
Technical Skills: Data Engineering (Microsoft Azure Certified), Data Analysis (Microsoft Certified Power BI Analyst), Machine Learning (ML Operations), Data Cleaning and Pre-processing, Data Modeling, Data Visualization, Optimization Algorithms, Data Lake, Data Warehousing, Cloud Fundamentals (Azure), IT Services
Programming Languages: T-SQL, Python, PySpark, C++, SAS
Software Tools: Databricks, MS SQL Server, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Power BI, SAS Enterprise Guide, SAS Viya, MLflow, Airflow, MATLAB/Simulink, MS Office (Word, Excel)
Other Skills: Technical Content Writing, Data Studies and Discovery, Troubleshooting and Optimization of Stored Procedures, ETL Process Implementation, Cross-Functional Collaboration, Documentation (Source-to-Target Mappings, Data Pipelines, Transformations), Participation in Design Sessions and Code Reviews
Education
Northeastern University
Master’s in Electrical and Computer Engineering
Concentration in Machine Learning
Session – Fall 2023 – Summer 2025
Overall GPA – 3.7 / 4.00
Boston, MA 02115
Power System Transients – Modelling and Simulation – Fall 2024
The course presents computer modeling of linear and nonlinear power system components to be used in transient studies. Methods of digital simulation of power systems operating in the steady-state and transient conditions will be covered. Use of the Alternative Transients Program (ATP) for design and analysis of power systems is discussed.
Carried out a term project and delivered a presentation on its outcome.
Machine Learning – Small Data – Fall 2024
This course focused on advanced deep learning techniques tailored for small data scenarios, addressing challenges in domains like healthcare and defense where large datasets are impractical.
It covered transfer learning, weak supervision, zero- and few-shot learning, data augmentation (including physics-based), and meta-learning. Emphasis was placed on practical applications and theoretical foundations such as VC dimension and kernel methods.
Introduction to Machine Learning – Summer 1 – 2024
Studies machine learning (the study and design of algorithms that enable computers/machines to learn from experience/data). Covers a range of algorithms, focusing on the underlying models behind each approach.
Emphasizes the foundations to prepare students for research in machine learning. Topics include Bayes decision theory, maximum likelihood parameter estimation, model selection, mixture density estimation, support vector machines, neural networks, probabilistic graphical models, and ensemble methods (boosting and bagging). Offers students an opportunity to learn where and how to apply machine learning algorithms and why they work.
Data Visualization – Spring 2024
Introduction to relevant topics and concepts in visualization, including computer graphics, visual data representation, physical and human vision models, numerical representation of knowledge and concept, animation techniques, pattern analysis, and computational methods. Tools and techniques for practical visualization.
Elements of related fields include computer graphics, human perception, computer vision, imaging science, multimedia, human‐computer interaction, computational science, and information theory. Covers examples from a variety of scientific, medical, interactive multimedia, and artistic applications. Includes hands‐on exercises and projects.
Fundamentals of Computer Engineering – Spring 2024
Introduces fundamental techniques in computer engineering used throughout the graduate curriculum. Covers basic programming and analysis methods and the formulation and solution of a wide range of computer engineering problems. Also discusses the applications of algorithm analysis and complexity theory to analyzing and solving problems.
Emphasizes those fundamental computational problems and related algorithms whose solution can be obtained in polynomial time. For basic computational problems such as sorting, searching, elementary graph algorithms, shortest-paths problems, as well as flow problems in networks, many different algorithms and data structures are described and analyzed, implemented, and compared both from a theoretical and from an experimental point of view.
Power System State Estimation – Fall 2023
Grade – A-
Course Objective – Offers an up-to-date account of the strategies utilized in state estimation of electric power systems. Provides a broad overview of power system operation and the role of state estimation in overall energy management. Presents an abundance of examples, models, tables, and guidelines to clearly examine new aspects of state estimation, the testing of network observability, and methods to assure computational efficiency.
Coursework Project –
Electric Vehicles – Fall 2023
Grade – A
Coursework – This course is designed to familiarize students with the electric vehicle (EV) powertrain. It covers different types of energy storage elements used in EVs and hybrid electric vehicles (HEVs), bidirectional DC-DC converters and their control, inverters, different power converter topologies for on-board charging, wireless power transfer for battery charging, and different types of electric motors used in EVs.
Coursework Project –
JSS Academy of Technical Education
Session – 2016-2020
Bachelor of Technology in Electrical and Electronics
CGPA – 3.7/4
Noida, India
Projects
(May 2022)
Absence management – Reporting and Feeds.
Project presented at Guardian Life Insurance Company of America
- Designing and developing custom feeds and reports for clients.
- Maintaining and troubleshooting the SSIS, SSRS, and SQL packages/components.
(May 2022)
Airlines data analysis product development.
Project presented at TCS
- Analyzing the SAS Studio (SAS Data Integration Studio, Enterprise Guide) ETL processes and developing the new ETL products in MSBI (SSIS, SSRS, PowerBI) and SQL server for the data analytics team.
- Reviewing the business requirements of ETL flows and designing the new interfaces as per the architecture.
(May 2021)
Migration of Enterprise Data Warehouse from SQL 2008R2 to SQL 2016.
Project presented at TCS
Environment setup for Windows from 2008 to 2016 and migration of SSIS packages by reconfiguring/upgrading the components from SQL 2008R2 to SQL 2016 and testing their compatibility.
(Feb 2021)
Application support for EDW/CDW.
Project presented at TCS
Monitoring the ETL flows for the Data Warehouse deployed in SQL Server Agent jobs, creating maintenance changes through ServiceNow, fixing repetitive problems by analyzing their root causes, reporting the application status, and scheduling the SQL jobs to manage the load on the server.
Academic Projects
InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models – Sept 2024-Dec 2024
At – Northeastern University, Boston MA | Visit here – Project
Tech Stack: Python, PyTorch
Project Description:
- Developed and evaluated an interaction-aware text-to-image diffusion model by integrating a pluggable module (InteractDiffusion) into Stable Diffusion v1.4, enabling fine-grained human-object interaction control in generated images using the HICO-DET dataset.
- Implemented interaction conditioning using a novel transformer-based module incorporating tokenization, spatial embeddings, and role-aware attention to guide image generation from HOI triplets (subject-action-object).
- Trained and benchmarked the model with FID, KID, and HOI detection mAP metrics, achieving improved interaction fidelity and image quality over baselines like GLIGEN and Stable Diffusion.
- Conducted ablation studies to assess the contribution of each interaction module component, showing progressive improvement in controllability and realism of synthesized images.
Insights from CREZ Reactive Compensation Study and EMTP study on Sub-Synchronous Resonance – Sept 2024-Dec 2024
At – Northeastern University, Boston MA | Visit here – https://ankeshkumar.sites.northeastern.edu/files/2025/06/Slides_Reactive_Compensation_Studies.pdf
Tech Stack: ATP (Alternative Transients Program), Power System Studies, Power System Stability, Transient Stability, Modeling and Simulation
Project Description:
- The large-scale integration of wind generation into the grid has led to increased reactive power requirements and associated stability challenges. This project examines the “CREZ Reactive Compensation Study,” which highlights the effects of Series Capacitance and Shunt Reactive Compensation.
- The study identifies potential challenges in implementing reactive compensation solutions and proposes mitigation strategies. A key issue identified was the potential for sub-synchronous interactions, including sub-synchronous resonance (SSR) and sub-synchronous control interaction (SSCI), which can destabilize the system and cause mechanical damage to turbine-generator shafts.
- To investigate the SSR effect, the IEEE Second Benchmark System for SSR Studies was utilized, and several case studies were conducted. Alternative Transients Program (ATP) simulations analysed system behavior under varying conditions, including blocking capacitor values, series compensation levels, fault timings, and a torque amplification study.
- The results demonstrated the necessity of transient simulation for ensuring system stability. These insights are valuable for planning SSR mitigation and maintaining system stability.
Detecting AI-Generated Text: A Hybrid Approach of Classical Machine Learning and Deep Learning – May 2024-Jun 2024
At – Northeastern University, Boston MA | Visit here – https://github.com/rahulmk8055/AI-vs-Human-Text-Classification
Tech Stack: Python, Classical ML (Logistic Regression, Support Vector Machines), and LLM fine-tuning (Transformers)
Project Description:
This project aimed to distinguish ChatGPT-generated text from human-written text across multiple domains by studying the characteristics of AI-generated content and developing detection methods.
Key work included:
- Identifying key features of the text
- Implementing a classical ML approach (Logistic Regression and SVM) for the binary classification problem
- Fine-tuning an LLM (RoBERTa) for the classification problem
Using a standard dataset with a train-test split, the models achieved high accuracy, with simple Logistic Regression reaching nearly 95%.
The project demonstrates that while fine-tuning RoBERTa for text classification achieves high accuracy on the training data, it struggles with unseen data due to overfitting. Combining RoBERTa’s hidden states with Logistic Regression improves performance on validation data, highlighting the potential of hybrid models.
Visualization Tool for Optimal Distributed Generation (DG) Placement in Power Distribution Networks – Feb 2024-Apr 2024
At – Northeastern University, Boston MA | Visit here – https://optimaldgplacementdashboard.streamlit.app/
Tech Stack: Python (Streamlit, Plotly, Altair, NetworkX, Pandas), MATLAB, Data Visualization, Power Systems Analysis, Optimization Algorithms (Exhaustive Search, Particle Swarm Optimization)
Project Description:
- Developed an advanced visualization platform to aid in strategic placement and sizing of Distributed Generation (DG) units within power distribution networks, utilizing IEEE standard bus systems for precise scenario analyses.
- Leveraged Python’s Streamlit interactive tool and data visualization libraries to create an intuitive user interface for visualizing the distribution network, power flow metrics, and the impact of DG integration.
- Implemented robust computational techniques, including Exhaustive Search and Particle Swarm Optimization algorithms, to determine the optimal configuration for DG units, minimizing power loss and ensuring system stability.
- Integrated MATLAB scripts for accurate power flow calculations, taking into account the DG size at each bus, ensuring reliable simulation results.
- Enabled users to adjust DG sizes and locations through a user-friendly interface that updates the displays and metrics in real time, facilitating informed decision-making.
- Visualized the relationship between DG size, bus number, and active power loss using interactive 3D mesh grids, highlighting the selected parameters and color-coding metric changes for easy interpretation.
- Animated the optimization process using scatter plots, allowing users to understand the comparative efficiency of their solutions against potential alternatives.
- Provided actionable insights based on calculated optimization strategies, displaying optimal metrics such as bus number, DG size, and revised power losses.
- This project demonstrates proficiency in data visualization, power systems analysis, and optimization techniques, showcasing the ability to develop innovative solutions for efficient and sustainable power distribution networks.
Design of IEEE 118-Bus system for Observability – Nov 2023-Dec 2023
At – Northeastern University, Boston MA | Visit here – https://ankeshkumar.sites.northeastern.edu/files/2025/06/Design-of-IEEE118-bus-for-Observability-1.pdf
- Implemented the SCADA measurement placement on the PET software for observability analysis.
- Developed MATLAB code to facilitate planning and calculation using Integer Linear Programming, incorporating observability analysis, sensitivity matrix computation for critical analysis, and cost estimation for installation.
Optimization of placement of DG in Distribution network – Jan 2019-Feb 2020
At – JSS Academy of Technical Education – Noida, India
- Found the optimal Distributed Generator (DG) sizes and locations to reduce power loss and voltage deviation and improve the voltage stability index, using analytical and heuristic methods (optimization algorithms such as PSO and GA) in MATLAB.
- Implemented the “Jaya algorithm”, a parameter-independent algorithm, to solve the optimal DG placement problem, which resulted in a 75% reduction in power loss.
Control of Induction Machines for Electric Vehicles – May 2019-July 2019
At – Indian Institute of Technology – Jodhpur, India
- Implemented the Direct Torque Control (DTC) technique with Space Vector Modulation (SVM) switching for a voltage source inverter (VSI) feeding an induction motor, in MATLAB and Simulink.
- Modeled the induction motor for transient and steady-state performance, applied the DTC scheme to the motor model, and presented the simulation results.
Publications
Allocation of DG in Distribution Network Using Parameter Independent JAYA Algorithm [JOURNAL-SCOPUS]
Journal of Electrical Engineering & Technology, 2022, doi:10.1007/s42835-022-01309-7.
Abstract – Integration of DGs in the distribution network has led to challenges in making the system reliability, and other parameters improve. This paper gives a detailed discussion on the placement of DG in the distribution system using the new algorithm ‘JAYA’ having no control parameter which makes the algorithm simple than previous used algorithms and also give best solutions according to the study. Three system parameters are considered for DG placement solution: active power loss reduction, voltage deviation reduction and voltage stability index maximization. The methodology is presented to calculate the optimal values for varying load profiles i.e. constant, residential, commercial and industrial. Type-1 DG placement is considered for placement. Simultaneous improvement of all factors using weight based multi-objective function is used. Standard test system IEEE-33 and 69 bus is used to validate the applicability of the methodology and compared to other optimization methods, the results presented proves the JAYA to be better optimization of optimal DG calculation as the method is simple, feasible and have good applicability with other nature inspired and meta-heuristic methods.
DG Placements in Distribution Network using Exhaustive Search approach [CONFERENCE]
2019 International Conference on Electrical, Electronics and Computer Engineering (UPCON), India, doi:10.1109/UPCON47278.2019.8980255
Abstract – The objective is the strategic placement of multiple DGs of different types to reduce the active power loss of the overall distribution network by choosing the optimal size and location of the DG. As study shows that randomly selecting location and sizing leads to higher power loss than network without DGs. This paper discusses an exhaustive search approach based on direct load flow method to find the location and size for optimal power loss of multiple DGs of all types in the distribution systems. The proposed method is developed and IEEE standard test systems (16-bus and 33-bus) is used to test the methodology.
Optimal location and sizing of Photovoltaic DG system using Direct Load Flow method [CONFERENCE]
In: National Conference on Smart Energy Systems, 2019. ISBN: 978-93-5361-694-6
Abstract – Inappropriate selection of location and corresponding size of Distributed Generator (DGs) in electrical network may have greater system losses than losses without DG. Application of incorporating DG in system has eased the problem of high power losses, voltage stability, low reliability and poor power quality. This paper suggests a simple and efficient load flow technique known as direct load flow method to find the optimal location and size of Type-3 DG in the distribution system. The proposed methodology was developed and tested in two distribution system with varying size and complexities and the effect of size and location of DG with respect to real power losses while maintaining the voltage profile of system within limits is examined with verification and discussed in detail.
Extra-Curricular
- Selected among the top 10 candidates for funded research at IIT Jodhpur under the Undergraduate Research Initiative 2019, for the research proposal “DC/DC & DC/AC Converter control for Energy Management by Polynomial control based on Worldwide Light Vehicle Testing Procedure (WLTP) – UC and Battery”.
- Student Co-ordinator at Smart India Hackathon-2019 (Nationwide 36-hour Hackathon organized by the MHRD, Govt. of India) hosted at JSS Academy of Technical Education, Noida.
- Member of Institute Innovation Council (Established by MHRD Innovation Cell, Govt. of India) from 2018 to 2020.
- Organized various innovation, patent-rights, and start-up talks, events, and competitions for campus students.
- Participant in the annual event conducted at Tata Consultancy Services (TCS).