Distributed Computing Software Engineer
AOL R&D in Palo Alto, CA is looking for a talented Distributed Computing Software Engineer to join our Large Scale Analytics (LSA) team. In this role, you will be instrumental in transforming research concepts into prototypes and software products. We value attitude, aptitude, communication skills, and coding skills over experience with specific languages and environments.
Large Scale Analytics (LSA) is primarily a research group that develops prototypes to prove new and advanced data analysis algorithms. Team members work with very large amounts of data (2.5 to 3 billion records per day) to discover and prove new data analytics algorithms and to surface novel methods for optimizing the statistical models that enable prediction and personalization analysis. The end goal is to improve the targeting of online advertising campaigns, thus maximizing revenue. In addition to using distributed computing technologies and proving newly developed algorithms, the work involves architecture and design to incorporate these algorithms into new products.
This position:
- Performs research and iterative prototyping with large-scale distributed computing and distributed database systems architecture;
- Utilizes experience with distributed file systems, database architecture, and data modeling to organize and process large data sets;
- Develops software to support data mining projects and contextual analysis, such as crawling, parsing, indexing, and unique content analysis;
- Collaborates with scientists and analytics solution architects to design distributed data storage and processing services that are scalable, reliable, and available;
- Identifies potential performance bottlenecks and scalability issues to justify or critique the design of new algorithms;
- Assists researchers with accessing and processing large amounts of data.
This is an exciting position that requires:
- Research, analyze, and convert large amounts of raw collected data and content into new, structured data sets that preserve data context, in order to enable the productization of new products;
- Work with data warehousing and distributed/parallel processing of large data sets, using parallel computing systems for map/reduce computation on Linux clusters (e.g. Hadoop/Cloud technologies, HDFS);
- Work with modern development methodologies such as Agile, Scrum, and SDLC;
- Ability to work in a research-oriented, fast-paced, and highly technical environment;
- Knowledge of distributed system design, data pipelining, and implementation;
- Knowledge of and experience in building large-scale applications using various software design patterns and OO design principles;
- Expertise in design patterns (UML diagrams) and data modeling of large-scale analytic systems;
- Master’s degree in Computer Science;
- Quick thinker and a fast learner;
- Collaborative spirit;
- Excellent communication and interpersonal skills;
- At least 3 years of software development experience;
- At least 2 years of experience working with distributed systems;
- Experience with either distributed computing (Hadoop/Cloud) or parallel processing (CUDA/threads/MPI).