Earlier this year, Eurostat, the statistical office of the European Union (EU), hosted the fourth edition of the European Big Data Hackathon on the Amazon Web Services (AWS) Cloud. The hackathon is a satellite event of the biennial international scientific conference series, New Techniques and Technologies for Statistics (NTTS).
During the hackathon, teams from all over Europe compete in a statistical challenge. The teams develop innovative approaches, applications, and data products combining official statistics and big data. The aim is to extract insights that inform policymakers on pressing policy questions facing Europe. During this year’s event, participants were asked to develop data analytics tools and build an early warning system that uses data from credit card transactions to alert EU policymakers to sudden changes in the economy.
Learn how Eurostat leveraged the flexibility and scalability of the AWS Cloud to allow 24 teams across Europe to compete and build a viable early warning system over four days.
Designing a hackathon to serve multiple user needs
To support the European Big Data Hackathon, Eurostat wanted the flexibility to create a temporary dedicated environment for the event without investing in permanent hardware and software for such a short time period. They opted to build a cloud solution, which would let them increase or decrease the size and number of servers as needed during the event, saving energy and costs. Based on existing contracts for the provision of cloud services, Eurostat decided to host the event on AWS. This allowed them to use AWS data centres in Europe, draw on the large variety of AWS managed services available, and set up and deploy multiple temporary environments through infrastructure as code (IaC) in the AWS Cloud.
Eurostat worked with the Directorate General for IT (EC DIGIT), a fellow Directorate General of the European Commission (EC) responsible for the EC's digital infrastructure, and with Deloitte, an AWS Partner, to design and deliver the hackathon infrastructure, hosting open-source services in the cloud. To make the hackathon accessible to a large number of statisticians with varying levels of knowledge of cloud services and applications, they decided to put in place two types of development environments. Users less experienced with cloud services accessed a user-friendly, flexible environment with a pre-defined, restricted set of data analytics tools; statisticians with more cloud experience used a powerful and less customised environment with graphics processing units (GPUs) for advanced machine learning (ML) models.
Building a big data hackathon on AWS
Using AWS, Eurostat and EC DIGIT quickly set up and tested the environments and created a flexible platform to serve the 24 competing teams. These teams worked continuously over four days with the provided anonymized data. Maintaining the high availability, scalability, and performance of the underlying resources was essential during the hackathon event.
Amazon Elastic Kubernetes Service (Amazon EKS) allowed data science users to run familiar software without worrying about the underlying infrastructure and enabled them to use AWS Auto Scaling when they needed to perform central processing unit (CPU)- and memory-intensive extract-transform-load (ETL) or extract-load-transform (ELT) processes. At the peak of utilization, Eurostat ran 50 16xl servers. Amazon Elastic Compute Cloud (Amazon EC2) instances gave users simple access to readily configured GPU instances. Monitoring tools like Amazon CloudWatch allowed Eurostat to save both costs and energy by not leaving resources running idle. To support statisticians in finding innovative ways to turn data into insights, AWS funded the event hosting costs through AWS Promotional Credit.
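To make the idea of not leaving resources running idle more concrete, here is a minimal Python sketch of how idle instances might be flagged from utilisation samples. It is an illustration only and not a description of Eurostat's actual setup: the `InstanceUsage` type, the instance names, the 5% threshold, and the six-sample window are all assumptions. In practice the samples could be retrieved from Amazon CloudWatch, but the sketch works on plain lists so it runs without an AWS account.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceUsage:
    """Recent CPU utilisation for one instance (hypothetical structure)."""
    instance_id: str          # illustrative identifier, not a real resource
    cpu_percent: List[float]  # average CPU utilisation per period, oldest first


def is_idle(usage: InstanceUsage, threshold: float = 5.0, min_samples: int = 6) -> bool:
    """Return True if every one of the last `min_samples` samples is below `threshold`.

    Both defaults are assumed values chosen for illustration.
    """
    recent = usage.cpu_percent[-min_samples:]
    if len(recent) < min_samples:
        # Too little history to judge; treat the instance as active.
        return False
    return max(recent) < threshold


def idle_instances(fleet: List[InstanceUsage],
                   threshold: float = 5.0,
                   min_samples: int = 6) -> List[str]:
    """List the identifiers of instances that look idle, preserving fleet order."""
    return [u.instance_id for u in fleet if is_idle(u, threshold, min_samples)]
```

A caller could pass the result to whatever process stops or scales down resources. Requiring every recent sample to fall below the threshold, rather than only the average, avoids flagging an instance that has just finished a short burst of work.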
As part of the AWS Infrastructure Event Management (IEM) service, an AWS Technical Account Manager (TAM) was ready to step in to support Eurostat through any issues, but Eurostat didn’t experience any. During the whole event, the system maintained high availability, users accessed the capacity they needed, the deployment of new machines occurred within minutes, and all the hackathon functionalities performed seamlessly for the competing teams. Throughout the event, Deloitte supported participants with their technical questions, capitalizing on its past experience organizing hackathons.
Eurostat, EC DIGIT, and Deloitte designed the hackathon on AWS and developed simple-to-use solutions that supported participants in designing and implementing their machine learning use cases. Hackathon participants concentrated their efforts on applying their domain expertise without worrying about IT infrastructure.
Conclusion
Capitalising on the success of the event, Eurostat, together with the European Commission and National Statistical Institutes, is now looking to expand and generalise this approach to accelerate the exploration of non-traditional data sources and new methods in support of innovation in official statistics.
To learn more about how AWS supports public sector customers, visit the AWS in the Public Sector website.