Professional Portfolio: Mary (Miet) Loubele
Research and Academic Achievements
My research has focused on addressing critical challenges in model evaluation for advanced machine learning and algorithm development. During my PhD, I developed a comprehensive protocol for constructing effective test sets and ground truth, balancing precision levels with associated costs. This work has garnered significant recognition:
- 2,814 citations across publications
- h-index of 10
- i10-index of 10
Initially applied to dental scanners, my protocol optimized bone model quality while minimizing radiation dose. I identified the i-CAT scanner as the most efficient option, balancing image quality and radiation exposure. This success led to broader applications in machine learning and algorithm development.
Industry Experience and Project Highlights
My industry experience spans various projects applying advanced machine learning and algorithm development to real-world problems:
- Evaluated algorithm porting accuracy between Oracle SQL and Microsoft SQL Server 
- Developed multi-level test sentences for personal assistant ML algorithm evaluation 
- Employed Monte Carlo simulations to assess big data solution performance 
- Expanded the training data for an LSTM-based (neural net) search solution from 2 to 300 shopping malls
- Evaluated ML algorithms for video feed and classification at Facebook and Instagram to enhance user experience
- Built training corpora for use in RAG and fine-tuning of Large Language Models (LLMs)
- Performed data science analysis on 25 years of vulnerability data
- Presented "Protection against threats in the real world" at Cisco Offensive Summit 2016
Complex System Upgrades
I have a strong track record in upgrading legacy systems and improving operational efficiency:
- Led data engineering framework upgrades for privacy and security compliance 
- Was part of a war-room team that resolved the "cold start" problem for video distribution for a new product at Facebook
- Optimized a training data cleanup tool, reducing processing time from days to minutes 
- Facilitated smooth transitions for ML solution upgrades affecting hundreds of developers 
Natural Language Processing (NLP) Expertise
With 3 years of specialized NLP experience, I have:
- Applied Support Vector Machines for primary classification, increasing precision by 5% through confusion-matrix analysis
- Led system localization, translating a complete NLP system from English to French 
- Specialized in active learning techniques for 2 years 
- Developed rule-based classification systems and extended training data 
- Leveraged crowdsourcing for data collection 
Machine Learning and Classification Systems
I have successfully bootstrapped secondary classification systems and built zero-to-one products:
- Developed a contact normalizer based on occupation data 
- Built a horizontal classification system for creator categorization 
- Normalized course titles at D2L 
- Created a system to predict sales call success 
- Developed classification systems for blog post effectiveness and global feed rating 
- Built a system to classify SQL queries for interview preparation using generative AI 
Cost Optimization and Efficiency Improvements
Throughout my career, I have consistently optimized costs and improved efficiency:
- Managed cloud budgets, avoiding excessive costs 
- Achieved a 75% reduction in my team's data warehouse costs
- Led a project to successfully address escalating costs for the Instagram data warehouse 
- Improved tool efficiency, reducing processing time from days to minutes 
- Automated manual processes for ML solution precision evaluation 
- Streamlined Salesforce data retrieval for data scientists
Generative AI and Content Creation
- Developed a corpus of blog posts for a Retrieval-Augmented Generation (RAG) system
- Built proof-of-concept content creation engines for efficient blog post distribution 
- Implemented a classification system to assess media feed health, resulting in a 22% increase in followers and over 150k impressions in six months 
Conference Talks and Community Leadership
I have actively contributed to the tech community through organizing meetups, giving talks, and engaging in various events:
Meetup Organization
Lead Organizer @ Intersections K∩W Meetup (August 2015 - March 2020)
- Grew membership from 120 to over 960 members
- Organized more than 40 meetups focused on mathematics, computer science, and data science 
- Negotiated sponsorships from regional tech companies 
- Mentored aspiring data scientists in the KW area 
 
Organizer @ Waterloo Data Science and Data Engineering Meetup (November 2017 - March 2020)
- Increased membership from 441 to 503 members in two months
- Collaborated on organizing monthly meetups 
- Secured venues and sponsorships 
 
Conference Talks and Presentations
- DEML Summit 2024: "Evolution of data engineering interviews over the last 14 years" 
- Cisco Offensive Summit 2016: "Protection against threats in the real world"
- Communitech Panel Discussion: "Practical Applications of Artificial Intelligence" 
- Annual Women in Data Science Conference 2018: "How do data teams operate in lean startups" 
- Google Meetup 2018: "Data Pipelines in AI for SAAS applications" 
- Ryerson University Invited Lecture: "Understanding non-technical skills in data teams in industry" 
- Super Data Science Podcast: "The Amazing world of Data Science Meetups"
- Shopify event: lightning talk + panel discussion: "How to use online data to grow your local community" 
- Toronto Machine Learning Micro Summit: "Iterative strategies for a neural-net based search solution" 
- Startup Analytics Podcast: "A Data Engineer Career Path | Startups to FAANG | Academia to Industry"