Probabilistic Forecasting with DeepAR and AWS SageMaker
EuroPython 2020 – Talk – 2020-07-24 – Parrot Data Science
By Nicolas Kuhaupt
In time series forecasting we are interested in how a time series will continue into the future. This is of high importance in areas such as forecasting energy production from renewable resources, customer demand, or product prices. Many forecasting algorithms provide only a point prediction. However, we are often also interested in how likely that prediction is and how much it may vary. This is what probabilistic forecasting is for: with every forecast, we also obtain an upper and lower bound with certain probabilities. For a long time, probabilistic forecasting was limited to traditional techniques like ARIMA. DeepAR is an algorithm that allows us to combine deep learning techniques with probabilistic forecasting. Additionally, in contrast to training a model for each time series individually, DeepAR suggests training one large forecasting model for all related time series. The algorithm was developed by Amazon and is also provided in AWS SageMaker.
In this talk, we will understand the theoretical basics of DeepAR, have a look at a practical time series example and will demonstrate an implementation. In the end, you will be prepared to get started with your own forecasts.
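To make the idea of upper and lower bounds concrete, here is a minimal NumPy sketch of the core output of a probabilistic forecaster: instead of one point forecast, many sample paths are drawn from a predictive distribution, and quantiles of those paths give the bounds. The random-walk sampler below is a toy stand-in, not DeepAR itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a probabilistic model: draw many sample paths from
# a predictive distribution (here a random walk around the last
# observed value) instead of producing a single point forecast.
last_value = 100.0
horizon, n_samples = 24, 1000
noise = rng.normal(0.0, 2.0, size=(n_samples, horizon))
sample_paths = last_value + np.cumsum(noise, axis=1)

# The point forecast is the median; the 10%/90% quantiles give lower
# and upper bounds that hold with the stated probabilities.
median = np.quantile(sample_paths, 0.5, axis=0)
lower = np.quantile(sample_paths, 0.1, axis=0)
upper = np.quantile(sample_paths, 0.9, axis=0)

assert (lower <= median).all() and (median <= upper).all()
```

DeepAR works the same way at prediction time: it samples trajectories from the learned distribution and reports quantiles of the samples.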
License: This video is licensed under the CC BY-NC-SA 3.0 license: https://creativecommons.org/licenses/by-nc-sa/3.0/
Please see our speaker release agreement for details: https://ep2020.europython.eu/events/speaker-release-agreement/
Irregular time series and how to whip them
This talk will present best practices and the most commonly used methods for dealing with irregular time series. Though we’d all like data to come at regular and reliable intervals, the reality is that most time series data doesn’t come this way. Fortunately, there is a long-standing theoretical framework for knowing what does and doesn’t make sense when corralling this irregular data.
History of irregular time series
Statisticians have long grappled with what to do in the case of missing data, and missing data in a time series is a special, but very common, case of the general problem of missing data. Luckily, irregular time series offer more information and more promising techniques than simple guesswork and rules of thumb.
Your best options
I’ll discuss best practices for irregular time series, emphasizing in particular early-stage decision making driven by the data and the purpose of a particular analysis. I’ll also highlight Python best practices and state-of-the-art frameworks that correspond to statistical best practices.
In particular I’ll cover the following topics:
Visualizing irregular time series
Drawing inferences from patterns of missing data
Correlation techniques for irregular time series
Causal analysis for irregular time series
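One early-stage decision the talk points at is whether to regularize the data at all. A common (but not always appropriate) move is to interpolate irregular observations onto a regular grid; the timestamps and values below are illustrative, and whether this is valid depends on why the data are irregular in the first place:

```python
import numpy as np

# Observations arriving at irregular times (e.g. event-driven sensors).
obs_times = np.array([0.0, 0.7, 1.1, 3.0, 4.5])    # seconds
obs_values = np.array([10.0, 12.0, 11.0, 15.0, 14.0])

# Linearly interpolate onto a regular 1-second grid. Whether this is
# sound depends on the missingness mechanism -- the talk's point about
# inspecting patterns of missing data before choosing a technique.
grid = np.arange(0.0, 5.0, 1.0)
regular_values = np.interp(grid, obs_times, obs_values)

print(regular_values)  # interpolated values at t = 0, 1, 2, 3, 4
```

Observed points that fall exactly on the grid pass through unchanged; the others are linear blends of their two neighbors.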
Slides available here: https://speakerdeck.com/aileenanielsen/irregular-time-series-and-how-to-whip-them
Recurrent Neural Networks in Python: Keras and TensorFlow for Time Series Analysis – by Matt O’Connor
A look at neural networks, specifically recurrent neural networks, and how to implement them in Python for various applications, including time series (stock prediction) analysis, using the popular machine learning libraries Keras and TensorFlow.
http://pycon.hk/2017/topics/recurrent-neural-networks-in-python-keras-and-tensorFlow-for-time-series-analysis/
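The talk builds its models in Keras, but the recurrence an RNN layer implements is simple enough to sketch in plain NumPy. The sketch below is an Elman-style cell with random stand-in weights (not trained values), showing why RNNs suit sequences: the same weights are reused at every step, and the hidden state carries history forward.

```python
import numpy as np

rng = np.random.default_rng(1)

# A single Elman-style RNN cell: h_t = tanh(W_x x_t + W_h h_{t-1} + b).
# Keras' SimpleRNN layer implements this same recurrence; the weights
# here are random placeholders rather than trained parameters.
n_in, n_hidden = 3, 5
W_x = rng.normal(size=(n_hidden, n_in)) * 0.1
W_h = rng.normal(size=(n_hidden, n_hidden)) * 0.1
b = np.zeros(n_hidden)

def rnn_forward(sequence):
    """Run the cell over a (timesteps, n_in) sequence; return final state."""
    h = np.zeros(n_hidden)
    for x_t in sequence:
        h = np.tanh(W_x @ x_t + W_h @ h + b)  # state carries the past forward
    return h

sequence = rng.normal(size=(10, n_in))  # e.g. 10 days of 3 features each
h_final = rnn_forward(sequence)
print(h_final.shape)  # (5,)
```

The final hidden state summarizes the whole sequence and would be fed to a dense output layer for a prediction.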
Moving averages are foundational concepts in time series analysis and form baseline models when modeling time series data. These concepts are also used in feature engineering with traditional machine learning models, as well as in streaming analytics models.
In this video I will be covering
Simple Moving Average
Exponential Moving Average
Weighted Moving Average
Exponential Smoothing Weighted Average
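The first three averages in the list above can be sketched in a few lines of NumPy; the windowed variants weight the last `window` observations, while the exponential version smooths recursively. This is a generic illustration, not the video's own code:

```python
import numpy as np

def simple_ma(x, window):
    """Unweighted mean over a sliding window of the last `window` points."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def weighted_ma(x, window):
    """Linearly weighted: the most recent point gets the largest weight."""
    w = np.arange(1, window + 1, dtype=float)
    # np.convolve applies the kernel reversed, so reverse the weights to
    # put the largest weight on the newest observation in each window.
    return np.convolve(x, w[::-1] / w.sum(), mode="valid")

def exponential_ma(x, alpha):
    """Exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    s = [x[0]]
    for value in x[1:]:
        s.append(alpha * value + (1 - alpha) * s[-1])
    return np.array(s)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(simple_ma(x, 3))  # [2. 3. 4.]
```

Note the trade-off the video's ordering hints at: windowed averages forget everything outside the window, while the exponential average keeps a geometrically decaying memory of the entire history.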
This video is about building a model that can generate text using Keras. We are using an LSTM network to generate the text.
The code for this video:
Website: https://gilberttanner.com/
Time series data is, in today’s age, ubiquitous. With the emergence of sensors and IoT devices, it spans all aspects of modern life, from basic household devices to self-driving cars, affecting all our lives. Classification of time series is therefore of unique importance today. With the advent of deep learning techniques, there has been an influx of focus on recurrent neural networks (RNNs) for sequence-related tasks, and rightly so. In this talk, I will attempt to describe the reasons for the success of RNNs on sequence data. Eventually we will turn to other techniques that should also be considered when working on such problems. I will draw examples from the healthcare domain and delve into some of the other useful deep learning techniques and their usefulness.
Aditya Patel is the head of data science at Stasis and has 7+ years of experience spanning the fields of machine learning and signal processing. He graduated with a dual master’s degree in Biomedical and Electrical Engineering from the University of Southern California. He has presented his machine learning work at multiple peer-reviewed healthcare conferences across geographies. He also contributed to the first-generation “Artificial Pancreas” project at Medtronic, Los Angeles. In his current role he is leading the advent of smart hospitals in Indian healthcare.
Masood Krohy at April 9, 2019 event of montrealml.dev
Title: Seq2seq Model on Time-series Data: Training and Serving with TensorFlow
Summary: Seq2seq models are a class of Deep Learning models that have provided state-of-the-art solutions to language problems recently. They also perform very well on numerical, time-series data which is of particular interest in finance and IoT, among others. In this hands-on demo/code walkthrough, we explain the model development and optimization with TensorFlow (its low-level API). We then serve the model with TensorFlow Serving and show how to write a client to communicate with TF Serving over the network and use/plot the received predictions.
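For the serving half of the walkthrough, TensorFlow Serving exposes a REST API: a POST to `/v1/models/<name>:predict` with a JSON body of the form `{"instances": [...]}`. The sketch below builds such a request with only the standard library; the model name `seq2seq`, host, and port are placeholders for whatever your `tensorflow_model_server` instance exposes, and this is not the talk's own client code.

```python
import json
import urllib.request

# TF Serving's REST predict endpoint; host, port and model name below
# are assumptions -- substitute those of your running model server.
SERVER = "http://localhost:8501"
MODEL = "seq2seq"

def build_predict_request(windows):
    """Build an HTTP request for a batch of input time-series windows."""
    url = f"{SERVER}/v1/models/{MODEL}:predict"
    payload = json.dumps({"instances": windows}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )

# One batch of two 4-step univariate windows.
req = build_predict_request([[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]])
print(req.full_url)  # http://localhost:8501/v1/models/seq2seq:predict

# Sending it (requires a running server):
#   predictions = json.load(urllib.request.urlopen(req))["predictions"]
```

The response mirrors the request: a JSON object whose `"predictions"` list has one forecast per input window, ready to plot on the client side.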
Code on GitHub: https://github.com/patternedscience/time-series-tf-serving
Bio: Masood Krohy is a Data Science Platform Architect/Advisor and most recently acted as the Chief Architect of UniAnalytica, an advanced data science platform with wide, out-of-the-box support for time-series and geospatial use cases. He has worked with several corporations in different industries in the past few years to design, implement and productionize Deep Learning and Big Data products. He holds a Ph.D. in computer engineering.
This video is a production of PatternedScience Inc.
LinkedIn: https://www.linkedin.com/company/patterned-science/
This video was recorded in San Francisco on February 5th, 2019.
Slides from the session can be viewed here: https://www.slideshare.net/0xdata/marios-michailidis-mathias-muller-h2oai-time-series-with-h2o-driverless-ai-h2o-world-san-francisco
Driverless AI is H2O.ai’s latest flagship product for automatic machine learning. It fully automates some of the most challenging and productive tasks in applied data science such as feature engineering, model tuning, model ensembling and model deployment. Driverless AI turns Kaggle-winning grandmaster recipes into production-ready code, and is specifically designed to avoid common mistakes such as under- or overfitting, data leakage or improper model validation, some of the hardest challenges in data science. Avoiding these pitfalls alone can save weeks or more for each model, and is necessary to achieve high modeling accuracy.
Driverless AI is now equipped with time-series functionality. Time-series forecasting helps predict sales, industrial machine failure, and more. With the time-series capability in Driverless AI, H2O.ai directly addresses some of the most pressing concerns of organizations across industries: transactional data in capital markets, tracking in-store and online sales in retail, and sensor data in manufacturing to improve supply chains or predictive maintenance.
Bio: Marios Michailidis is a Competitive Data Scientist at H2O.ai. He holds a BSc in Accounting and Finance from the University of Macedonia in Greece, an MSc in Risk Management from the University of Southampton, and a PhD in machine learning from UCL. He has worked in both the marketing and credit sectors in the UK market and has led many analytics projects on themes including acquisition, retention, recommenders, fraud detection, portfolio optimization, and more. He is the creator of KazAnova, a freeware GUI for credit scoring and data mining made 100% in Java, as well as the creator of the StackNet meta-modelling framework. In his spare time he loves competing in data science challenges and was once ranked 1st out of 500,000 members on the popular Kaggle.com data competition platform. He currently ranks 3rd.
Bio: A Kaggle Grandmaster and a Data Scientist at H2O.ai, Mathias Müller holds an AI- and ML-focused diploma (eq. MSc) in computer science from Humboldt University in Berlin. During his studies, he worked keenly on computer vision in the context of bio-inspired visual navigation for autonomous flying quadcopters. Prior to H2O.ai, he was a machine learning engineer at FSD Fahrzeugsystemdaten GmbH in the automotive sector. His stint with Kaggle was a chance encounter: he stumbled upon the data competition platform while looking for a more ML-focused alternative to TopCoder. There he entered his first predictive modeling competition and climbed the ladder to become a Grandmaster. He is an active contributor to XGBoost and is working on Driverless AI at H2O.ai.
In the first half of this video, Jo-Fai will share his joyful (yet sometimes very painful) Kaggle experience since joining the data mining competition platform. Coming from a rather traditional engineering background, data science was once like a complete myth to him. Joe will explain why participating in Kaggle is one of the most effective ways to kick-start a data science career. He will also explain how he used H2O for two Kaggle competitions: Rossmann Store Sales (2015) and Santander Product Recommendation (2016).
View slides here: http://bit.ly/2lsrD3F
Jo-fai (or Joe) Chow is a data scientist at H2O.ai. Before joining H2O, he was on the business intelligence team at Virgin Media in the UK, where he developed data products to enable quick and smart business decisions. He also worked remotely for Domino Data Lab in the US as a data science evangelist, promoting products via blogging and giving talks at meetups. Joe has a background in water engineering. Before his data science journey, he was an EngD research engineer at the STREAM Industrial Doctorate Centre, working on machine learning techniques for drainage design optimization. Prior to that, he was an asset management consultant specializing in data mining and constrained optimization for the utilities sector in the UK and abroad. He also holds an MSc in Environmental Management and a BEng in Civil Engineering.
In the second part of this talk, Abhishek will present his research in applying deep learning to time series prediction. He focuses on applying these new methods to light curves in the field of astronomy.
View slides here: http://bit.ly/2mLX4qF
Abhishek Malali is a Master of Engineering student at Harvard University specializing in computational sciences. He focuses on applying machine learning research to time series. Currently he is working on prediction for irregular time series using deep learning architectures.