The Bing Web Data team is looking for experienced platform and Big Data engineers to help us build the next-generation platform for Bing. This is a great opportunity for someone who loves to tackle deep technical challenges and strives for customer impact. An ideal candidate is an expert in C++, C#, or Java with experience in platforms and/or Big Data (Hadoop, Cosmos, BigTable, etc.) and a strong interest in learning and achieving more in these areas.

We are currently building an entirely new system for the web search indexing backend that will be an order of magnitude larger and faster than anything that currently exists. The new indexing pipeline is built on top of a petabyte-scale table and incremental processing infrastructure. The goal of the system is to process billions of documents a day with end-to-end (E2E) latency of seconds to minutes. To design the new system, we are applying a combination of approaches from the fields of distributed computing, network programming, algorithm optimization,
Our core engineering challenges include:
• Discover, crawl, and index hundreds of billions of web documents quickly and efficiently.
• Build a Document Understanding platform that partner teams can plug into to run their classifiers and models, allowing friction-free experimentation.
• Provide the Web Graph / Index as a service that Microsoft teams can use to build rich applications.
• Parse and classify billions of web documents, performing encoding and language detection, script segmentation, and sentence breaking.
• Use NLP, ML, and DL techniques to extract document data, NER, POS tagging,
Required skills:
- BS degree in Computer Science or equivalent
- Strong knowledge of C/C++/C#/Java with at least 5 years of programming experience
- Ability to perform independent research and work with data
- Strong design skills and engineering excellence fundamentals
- Experience writing multi-threaded code
- Excellent debugging skills

Desired skills and experience:
- 10+ years of programming experience
- Experience in distributed systems
- Experience working with data sets and Big Data
- Experience shipping service-based applications
- Experience acting as a tech lead or project lead