Data Engineer Intern

Company Description
Founded in 2016 and backed by Sequoia and Tencent, VoxelCloud is a Los Angeles-based global leader in artificial intelligence (AI) analysis of medical images. We help healthcare providers make better, earlier diagnoses and other clinical decisions.

Job Description
The Data team at VoxelCloud (Westwood, Los Angeles, CA) manages and maintains large-scale medical and healthcare data at the core of all our R&D activities. Reporting to the Data Team Lead, the Data Engineer intern will participate in the acquisition and manipulation of massive datasets in multi-modal formats (medical images, text (EMR), etc.) in cloud storage. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer intern will support our software developers and machine learning engineers on product/research initiatives and will create an optimal data delivery pipeline that is consistent across ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives.

  • Create and maintain optimal data pipelines to support machine learning research and development
  • Identify, design, and implement internal process improvements: automating data QA, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS/AliCloud big data technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into product utilization and operational efficiency.
  • Keep our data separated and secure across national boundaries, both locally and in cloud storage.


Qualifications

  • Proficiency in at least one object-oriented or functional scripting language: Python, Java, C++, Scala, etc.
  • Working knowledge of SQL and experience with relational databases and query authoring, as well as working familiarity with a variety of databases (e.g., Postgres).
  • Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
  • Solid understanding of information retrieval, statistics and machine learning. Experience with Computer Vision and NLP is a plus.
  • Preferred: 1+ years of experience in big data and related technologies (e.g., DFS), and experience with high-performance, scalable distributed systems.
  • Preferred: experience with AWS cloud services: EC2, EMR, RDS, Redshift.
  • Skilled at automating tasks, but willing to get hands dirty for quality control.
  • Detail-oriented, well organized, and self-motivated, with a continuous drive to learn, explore, and challenge; good communication skills and a team player.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • MS or BA/BS degree in computer science, statistics, or a related field.

Additional Information
We Offer…

  • An outstanding start-up culture
  • Transparent, collaborative work environment
  • Competitive compensation
  • Excellent medical, dental, and vision coverage
  • 401(k), paid vacation and holidays

All your information will be kept confidential according to EEO guidelines.

Location: 1085 Gayley Avenue, Los Angeles, CA 90024

Job Type: Full-time


Published 2022-06-10
