Editor's Guide: As one of the two most important promotional events of the year, 618 involves many departments and systems. How should a task scheduling product manager plan the work and safeguard 618? The author analyzes this question and shares the findings.
618 is a major event for e-commerce, and every team brings its own strengths to it. Those of us on the middle-office systems stay busy with all kinds of tasks in the out-of-sight "middle and back office". This article lifts the veil on that work and looks at how these "underground" workers safeguard 618: how millions of cold servers are made to work together, how petabyte-level data operations are supported, and how tens of billions of orders and a GMV at the 100-billion level are secured.
The story starts with the "scheduling platform", the core link of the big data platform. Task scheduling is the heavyweight product for offline computing on the big data platform. It handles synchronization between the various databases and data marts, and it also runs all kinds of offline data computing jobs. Its main application scenarios are data management, data movement, computation, and storage.
Task scheduling currently supports a variety of task types, including common tasks, data computing (py/sh/zip), data inbound tasks, data outbound tasks, data zip tasks, and data synchronization (JDW to Jmart).
Data computing (py/sh/zip): Scheduling supports multiple script types, such as Python, shell, and jar, and provides strong computing power and timed execution to support data analysis work.
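As a rough illustration of how a scheduler might dispatch these script types on a timer, here is a minimal Python sketch. It is not the platform's implementation: the `RUNNERS` mapping, the `ScriptTask` class, and the `build_command` and `is_due` helpers are all hypothetical names, and the daily hour/minute trigger is an assumed, simplified stand-in for whatever timing rules the real product offers.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical mapping from script extension to the command that runs it.
RUNNERS = {
    ".py": ["python3"],
    ".sh": ["sh"],
    ".jar": ["java", "-jar"],
}


@dataclass
class ScriptTask:
    """A scheduled script (assumed shape, for illustration only)."""
    name: str
    script_path: str
    hour: int    # hour of day at which the task should fire
    minute: int  # minute of that hour


def build_command(task: ScriptTask) -> list:
    """Pick the runner from the script's extension and return the argv list."""
    for ext, runner in RUNNERS.items():
        if task.script_path.endswith(ext):
            return runner + [task.script_path]
    raise ValueError(f"unsupported script type: {task.script_path}")


def is_due(task: ScriptTask, now: datetime) -> bool:
    """Simplified daily trigger: fire when the wall clock matches hour and minute."""
    return (now.hour, now.minute) == (task.hour, task.minute)
```

A scheduler loop would call `is_due` once a minute for each task and pass the result of `build_command` to something like `subprocess.run`. Keeping the command as a list rather than a shell string avoids quoting problems in script paths.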
Data inbound (warehousing) tasks: Task scheduling currently supports extracting data from a range of sources, including MySQL, HBase, ElasticSearch, Oracle, MongoDB, SQL Server, log files, and Phoenix, into the bdm layer of the data warehouse.
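The extraction step can be pictured with a small Python sketch. This is only an assumption-laden illustration: it uses the standard library's `sqlite3` in place of a real source such as MySQL and in place of the warehouse itself, and the `extract_to_bdm` function and the `bdm_` table prefix are hypothetical, not the platform's actual interface or naming.

```python
import sqlite3


def extract_to_bdm(source, target, table, columns):
    """Copy the given columns of `table` from the source connection into a
    `bdm_<table>` table on the target connection; return the row count.

    Table and column names are interpolated into the SQL, so they must come
    from trusted task configuration, never from user input.
    """
    col_list = ", ".join(columns)
    rows = source.execute(f"SELECT {col_list} FROM {table}").fetchall()

    # Hypothetical naming convention: prefix the landing table with "bdm_".
    target.execute(f"CREATE TABLE IF NOT EXISTS bdm_{table} ({col_list})")
    placeholders = ", ".join("?" for _ in columns)
    target.executemany(
        f"INSERT INTO bdm_{table} ({col_list}) VALUES ({placeholders})", rows
    )
    target.commit()
    return len(rows)
```

For example, with two in-memory databases created by `sqlite3.connect(":memory:")` and an `orders` table in the source, `extract_to_bdm(src, dst, "orders", ["id", "amount"])` lands the rows in `bdm_orders` on the target. A real inbound task would swap in a driver for each source type and would normally extract in batches rather than loading every row at once.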