Detecting Financial Fraud at Scale with Machine Learning - Elena Boiarskaia (H2O.ai)


Summary

The video traces the evolution of fraud detection from rule-based systems to machine learning on Spark. It emphasizes that machine learning adapts faster to changing fraudulent behavior and that data engineers, data scientists, and business users need to collaborate. It covers the trade-off between interpretability and accuracy, experimentation with different models, and the choice of performance evaluation metrics. It then explains model selection, evaluation, and deployment at scale using Spark, including the integration of H2O models with Spark through the Sparkling Water wrapper. Finally, it touches on feature interpretation with SHAP values, model tracking with MLflow, and strategies for effective fraud detection, including active learning and human review of model predictions.


Introduction to Fraud Detection

The speaker introduces the topic of fraud detection, highlighting the transition from legacy detection systems to machine learning and Spark.

Legacy Detection Systems

Discusses the challenges and limitations of legacy detection systems, focusing on the reliance on rule-based models and the lengthy process of development and deployment.

Transition to Machine Learning

Explains the shift towards machine learning for fraud detection, emphasizing the need for faster adaptation to changing fraudulent behaviors and the collaboration between data engineers, data scientists, and business users.

Utilizing Spark and Machine Learning

Describes the advantages of using Spark and machine learning models for fraud detection, showcasing the process of model selection, evaluation, and deployment at scale.

Interpretability and Accuracy in Models

Explores the importance of interpretability and accuracy in machine learning models for fraud detection, including considerations for false positives, false negatives, and model explainability.
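To make the false positive and false negative trade-off concrete, here is a minimal sketch in plain Python. The labels and predictions are made-up illustration data, and the helper names (confusion_counts, precision_recall) are hypothetical, not taken from the talk.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    """Precision: share of flagged cases that are fraud.
    Recall: share of fraud cases that were flagged."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example: 3 fraud cases among 8 transactions.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
```

A false positive typically costs reviewer time and customer friction, while a false negative is fraud that goes through, so which of the two numbers matters more depends on the business setting.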

Experimentation with Models

Details the experimentation process with different models, data balancing techniques, and performance evaluation metrics to optimize fraud detection results.
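The summary does not say which balancing techniques were tried. As one common option, the sketch below shows random undersampling of the majority (non-fraud) class in plain Python. The function name, the is_fraud field, and the ratio parameter are assumptions made for illustration.

```python
import random


def undersample_majority(rows, label_key="is_fraud", ratio=1.0, seed=0):
    """Keep every fraud row and at most `ratio` legitimate rows per fraud row.

    `rows` is a list of dicts; `label_key` holds 1 for fraud, 0 otherwise.
    """
    rng = random.Random(seed)  # local generator keeps the sample reproducible
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]
    keep = min(len(legit), int(len(fraud) * ratio))
    balanced = fraud + rng.sample(legit, keep)
    rng.shuffle(balanced)  # avoid all fraud rows sitting at the front
    return balanced


# Made-up data: 100 transactions, 5 of them labeled as fraud.
rows = [{"amount": float(i), "is_fraud": 1 if i % 20 == 0 else 0}
        for i in range(100)]
balanced = undersample_majority(rows, ratio=2.0)
```

Balancing should be applied to the training split only. Evaluating on a rebalanced test set would overstate precision relative to production, where fraud remains rare.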

Sparkling Water Wrapper and Model Deployment

Introduces the Sparkling Water wrapper for integrating H2O models with Spark, showcasing how to leverage this tool for model training, testing, and deployment.

Feature Interpretation and Model Tracking

Explains the use of SHAP values for feature interpretation, demonstrates model tracking with MLflow, and highlights the capabilities of monitoring and managing models effectively.
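SHAP values are based on Shapley values from game theory. The sketch below computes exact Shapley values by brute force for a tiny, made-up scoring function. It does not use the SHAP library, and it compares against a single baseline row rather than a background dataset, so treat it as an illustration of the idea under those assumptions. The function toy_score and its coefficients are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    Averages each feature's marginal contribution over every feature ordering.
    Cost grows factorially, so this is only usable for a handful of features.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start from the baseline row
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch one feature on
            score = predict(current)
            totals[name] += score - previous
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(x):
    # Hypothetical fraud score with an interaction term, for illustration only.
    return 0.002 * x["amount"] + 0.3 * x["foreign"] + 0.001 * x["amount"] * x["foreign"]


instance = {"amount": 100.0, "foreign": 1.0}
baseline = {"amount": 0.0, "foreign": 0.0}
phi = shapley_values(toy_score, instance, baseline)
```

The contributions sum to the difference between the score for the instance and the score for the baseline, which is the property that lets a reviewer see how much each feature pushed a given prediction up or down.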

Final Remarks and Q&A

Concludes the discussion on fraud detection strategies, including active learning, model adaptability, and the human component in reviewing model predictions. The session ends with the speaker addressing final questions from the audience.
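The summary names active learning without describing the strategy used. One common form is uncertainty sampling, where the transactions the model is least sure about are routed to human reviewers and their labels are fed back into training. The sketch below assumes model scores are probabilities of fraud between 0 and 1. The function name, transaction IDs, and scores are made up.

```python
def select_for_review(scores, budget):
    """Pick the `budget` transactions whose scores are closest to 0.5.

    `scores` maps a transaction ID to the model's estimated fraud probability.
    """
    ranked = sorted(scores.items(), key=lambda item: abs(item[1] - 0.5))
    return [txn_id for txn_id, _ in ranked[:budget]]


# Made-up scores for five transactions.
scores = {"t1": 0.97, "t2": 0.52, "t3": 0.08, "t4": 0.41, "t5": 0.66}
queue = select_for_review(scores, budget=2)
```

Here the reviewers would see t2 and t4 first, since those are the cases where a human label is likely to tell the model the most.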