System requirements

Please design and develop a logging system: a central application in which one can find and view the logs generated by a target application.

We can use this logging system to:

  1. collect logs

  2. search logs (based on time and logging level)

  3. view logs

Problems to resolve

FOR JUNIOR engineers:

  1. How to collect logs (sync or async? poll or push?)
    Sync: how to handle failures when sending logs?

  2. How to handle different log formats? (design pattern)

  3. Do logs need to be persisted?
    Limit the answer to a database; no ELK, etc.
    Table design, primary key (sequence or UUID?), overly long log content, etc.

  4. What APIs to expose?
    RESTful design, data structures, HTTP status codes, input validation and error handling

  5. How to search quickly?
    Indexes; note that logs are time-series data

  6. How to deal with very stale data?
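
As one possible starting point for items 3, 5 and 6, the sketch below uses Python's built-in sqlite3 module. It is only an illustration under stated assumptions, not a reference answer: the table name, column names, the 4000-character cap on message length, and the helper functions (open_store, collect, search, purge_older_than) are all hypothetical. It assumes a sequence-style primary key, timestamps stored as fixed-width UTC strings so that text order matches time order, and a composite index on (level, ts) to serve the "time and logging level" search.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema: every name here is illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS logs (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,  -- sequence-style key
    ts      TEXT NOT NULL,                      -- fixed-width UTC timestamp
    level   TEXT NOT NULL,
    message TEXT NOT NULL
);
-- Composite index: equality on level first, then a range scan on time.
CREATE INDEX IF NOT EXISTS idx_logs_level_ts ON logs (level, ts);
"""

MAX_MESSAGE_CHARS = 4000  # assumed cap for overly long log content


def open_store(path=":memory:"):
    """Open (or create) the log store and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def iso(dt):
    """Format a timezone-aware datetime so that string order equals time order."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def collect(conn, ts, level, message):
    """Store one log entry, truncating content that exceeds the assumed cap."""
    conn.execute(
        "INSERT INTO logs (ts, level, message) VALUES (?, ?, ?)",
        (iso(ts), level.upper(), message[:MAX_MESSAGE_CHARS]),
    )
    conn.commit()


def search(conn, start, end, level):
    """Return entries of one level with start <= ts < end, oldest first."""
    rows = conn.execute(
        "SELECT ts, level, message FROM logs "
        "WHERE level = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (level.upper(), iso(start), iso(end)),
    )
    return rows.fetchall()


def purge_older_than(conn, cutoff):
    """Delete stale entries older than the cutoff; return how many were removed."""
    cur = conn.execute("DELETE FROM logs WHERE ts < ?", (iso(cutoff),))
    conn.commit()
    return cur.rowcount
```

A retention job could call purge_older_than(conn, datetime.now(timezone.utc) - timedelta(days=30)) on a schedule; the 30-day window is an arbitrary example. Archiving old rows elsewhere before deleting them is an equally valid answer to item 6.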

FOR SENIOR/EXPERT engineers:

  1. In a distributed application, how to improve the performance of log collection?
    Message queue

  2. Messaging: how to avoid duplicate logs if a message queue is used?

  3. Scaling: how to find the performance bottleneck? How to scale out?
    Add service instances behind a load balancer
    How to improve database performance? (read-write separation or sharding; consider that the data is time-based)

  4. High availability (eliminate single points of failure)

  5. How to deal with very stale data in a distributed system?
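
For the duplicate-log question in item 2, one common approach (not the only one) is an idempotent consumer: the producer attaches a unique ID to each log event once, and the consumer lets a uniqueness constraint reject redeliveries. The sketch below assumes at-least-once delivery from the queue and simulates a redelivery by consuming the same event twice. The queue itself is left out, and the names open_dedup_store, new_event, consume and event_id are hypothetical.

```python
import sqlite3
import uuid

# Hypothetical schema: event_id is assigned by the producer, not the database.
DEDUP_SCHEMA = """
CREATE TABLE IF NOT EXISTS logs (
    event_id TEXT PRIMARY KEY,  -- unique per log event, reused on every resend
    ts       TEXT NOT NULL,
    level    TEXT NOT NULL,
    message  TEXT NOT NULL
);
"""


def open_dedup_store(path=":memory:"):
    """Open (or create) a store whose primary key rejects duplicate events."""
    conn = sqlite3.connect(path)
    conn.executescript(DEDUP_SCHEMA)
    return conn


def new_event(ts, level, message):
    """Producer side: attach a unique ID once, before the first send."""
    return {
        "event_id": str(uuid.uuid4()),
        "ts": ts,
        "level": level,
        "message": message,
    }


def consume(conn, event):
    """Consumer side: return True if stored, False if it was a redelivery."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO logs (event_id, ts, level, message) "
        "VALUES (?, ?, ?, ?)",
        (event["event_id"], event["ts"], event["level"], event["message"]),
    )
    conn.commit()
    # An ignored insert changes no rows, so rowcount tells the two cases apart.
    return cur.rowcount == 1
```

The design choice here is to make the write itself idempotent rather than to track seen IDs in a separate cache; both are defensible, and the cache variant trades some storage-level safety for lower write cost.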