Full Time – Toronto, ON
About the Company
At Riskfuel, we have pioneered the use of deep learning to accelerate financial models used by capital markets and insurance companies. Our models are millions of times faster than anything else on the market today. We are doing cutting-edge research, and our technology is winning major industry awards. The work is varied, interesting, and fast-paced, with lots of opportunity to make impactful contributions. We are looking for talented individuals to work and learn alongside colleagues who are leaders in the field. See more at Riskfuel.com.
About the Position
Riskfuel has a unique set of data management challenges. We run clients' slow financial pricing models with millions of simulated inputs to generate the training data for our neural networks. We use a variety of strategies to generate these simulation inputs, then distribute the calculations across thousands of nodes in our Kubernetes clusters to build a unique trade database for each model. This trade database becomes the training data that teaches our neural networks to approximate a client's model both accurately and quickly.
This role focuses primarily on the microservice architecture and data management strategies used in our data generation pipeline. We are developing features to make pricing request generation smarter and to use cluster compute resources more efficiently. At the same time, we are continually improving our overall data storage and management strategies.
Here are some of the things you’d be working on:
- Creating simulated model inputs, considering the statistical distribution of the input values and the downstream effects those choices have on ML training results
- Creating wrappers around client models, allowing us to employ advanced strategies for how we run simulations with the input data (e.g., run once, dynamic result convergence, run preset variations per input)
- Creating wrappers around input generators, using various mathematical strategies to produce inputs with more useful distributions that train models better on the final results
- Writing processing scripts to create training sets with specific properties
- Architecting, designing, and developing to convert research proof-of-concept strategies into scalable workflows that can handle hundreds of millions of datapoints
- Adding new features to our machine learning code, including new network architectures, data strategies, etc.
- Performance profiling and optimization
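To give a flavour of the first few bullets, here is a minimal sketch of a distribution-aware input generator. The parameter names, ranges, and distributions are illustrative assumptions for this example, not Riskfuel's actual model inputs:

```python
import random

def generate_inputs(n, seed=0):
    """Sample n simulated pricing inputs, choosing a distribution per
    parameter so the training data covers the space usefully.

    NOTE: hypothetical parameters and ranges, for illustration only.
    """
    rng = random.Random(seed)  # seeded for reproducible trade databases
    inputs = []
    for _ in range(n):
        inputs.append({
            # uniform over a plausible spot-price range
            "spot": rng.uniform(50.0, 150.0),
            # log-uniform, so low-volatility regions are as well
            # represented in training data as high-volatility ones
            "vol": 10 ** rng.uniform(-2, 0),  # 0.01 .. 1.0
            # maturities in years
            "maturity": rng.uniform(0.1, 5.0),
        })
    return inputs

samples = generate_inputs(1000)
```

Each sampled dictionary would then be fed through the client's pricing model to produce one labelled training record; the choice of per-parameter distribution is exactly the kind of decision that affects downstream ML training quality.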
About the Candidate
Fundamentally, we need you to be very proficient with Python and to understand software architecture principles. You should also be comfortable working in a Linux environment. Beyond that, experience with the items in our tech stack is definitely an asset, but not strictly required. In order of importance:
- Docker, Containerization
- Apache Pulsar, Apache Kafka, Event Streaming
- Object Storage (AWS S3, Minio, etc.)
- ArgoCD, Apache Airflow
- Azure, AWS, GCP
If you like diving into new technologies and aren’t afraid to pick up the occasional academic research paper, this is the job for you!
How to Apply
To apply for this position, please provide a resume along with any additional information that demonstrates your experience. We look forward to reading your application.