SaFi Bank Space : Caching of customer pubkey in IAM BE

Background

Currently, iam-manager’s /credential/public-key/{credentialId} call fetches the public keys from VIDA via /api/v1/device/fetch-public-key. This remote fetch is slow and dominates the latency of a call from the app to a service.

The call can be executed on its own, passing a trace ID in the traceparent header:

$ curl -X 'GET' 'https://iam-manager.apps.brave.safibank.online/credential/public-key/76932dcc-bcbf-4b5a-8ea2-77122db8304c' -H 'accept: application/json' -H 'traceparent: 00-a72c22771794443190d85d08f22b4890-1234567890123456-01'
{"silentKey":"MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEiOI7Rkj4ddlvFBN7K1MoyVqYZ6VRGgMqTGhT5YbdW/sgwbky0pXsJLTjQdgiGZKgEzZsivv8uQiB9NJottBK6Q==","presenceKey":"MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEQuSeB9Z+/FO1h6Tm6xDBe+VQ+eXXDhH67wcNXkJ1hBtUwJl3bw5etcCyMGal/rh2Qgi22E4SdcV5FB45fVW35w=="}
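To find each test call in the tracing UI, it helps to send a fresh traceparent value per request. The sketch below builds one in the same shape as the header used above (version 00, a 32-hex-digit trace ID, a 16-hex-digit parent ID, flags 01). The class and method names are hypothetical and are not part of iam-manager.

```java
import java.security.SecureRandom;

// Hypothetical helper for manual latency tests; not part of iam-manager.
final class TraceParent {
    private static final SecureRandom RANDOM = new SecureRandom();

    private TraceParent() {}

    /** Returns the given number of random bytes encoded as lowercase hex. */
    static String randomHex(int bytes) {
        byte[] buf = new byte[bytes];
        RANDOM.nextBytes(buf);
        StringBuilder sb = new StringBuilder(bytes * 2);
        for (byte b : buf) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    /** Builds "00-<16-byte trace id>-<8-byte parent id>-01" (sampled flag set). */
    static String newSampled() {
        return "00-" + randomHex(16) + "-" + randomHex(8) + "-01";
    }
}
```

The returned string can be pasted into the -H 'traceparent: ...' argument of the curl call, and the middle 32-digit part is the trace ID to search for.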

Goal

The main idea is to cache the public keys in memory to significantly lower the latency of this call.

Notes:

  • The public keys are small: about 125 characters each.

  • A simple Map or similar can be used.

  • Memory efficiency is not (yet) an issue or a goal, so please do not spend much time on it.

  • Cache invalidation is not a problem: we cache public keys by credential ID only, and the key bound to a credential ID is constant and cannot be changed. To assign a new public key to a customer, we create a new credential ID and assign that. The credential ID to customer ID assignments are in the DB and we do not cache them.

  • Because of this, we can run multiple instances. The only cost is that each instance may need to fetch a given public key once, since the instances are not synchronized.
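The notes above could be sketched as a per-instance map keyed by credential ID. This is a minimal illustration, not the iam-manager implementation: PublicKeys mirrors the two fields in the JSON response shown earlier, while PublicKeyCache and the fetchFromVida hook are hypothetical names standing in for the existing VIDA call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** The two keys returned by the endpoint (field names taken from the JSON response). */
final class PublicKeys {
    final String silentKey;
    final String presenceKey;

    PublicKeys(String silentKey, String presenceKey) {
        this.silentKey = silentKey;
        this.presenceKey = presenceKey;
    }
}

/** Hypothetical per-instance cache; entries never change because credential IDs are immutable. */
final class PublicKeyCache {
    private final Map<String, PublicKeys> byCredentialId = new ConcurrentHashMap<>();
    // Assumption: wraps the existing remote call to VIDA.
    private final Function<String, PublicKeys> fetchFromVida;

    PublicKeyCache(Function<String, PublicKeys> fetchFromVida) {
        this.fetchFromVida = fetchFromVida;
    }

    /** Returns the cached keys, fetching them at most once per credential ID on this instance. */
    PublicKeys get(String credentialId) {
        return byCredentialId.computeIfAbsent(credentialId, fetchFromVida);
    }
}
```

One design point to be aware of: ConcurrentHashMap.computeIfAbsent runs the fetch while holding an internal lock for that part of the map, so concurrent updates that land in the same bin wait for the remote call to finish. If that turns out to matter, a plain get followed by put (accepting an occasional duplicate fetch) is a simple alternative.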

Acceptance Criteria

  1. In-memory caching is implemented in iam-manager and reduces the latency to the single-digit ms range (preferably <1 ms).

  2. An item is kept in the cache for an hour, then it is removed.

  3. The cache is protected against concurrent (write) access, i.e. when new items are added to the cache.

  4. Basic statistics are logged every 15 minutes:

    1. number of items in the cache

    2. number of cache hits and cache misses (since the last log)

  5. This ticket is updated with trace screenshots proving the speed improvement:

    1. trace1: first call for a credential (cache miss)

    2. trace2: second call for a credential (cache hit)
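Criteria 2 to 4 could be covered by something like the sketch below: entries carry a load timestamp, a one-hour TTL is checked on read, hit and miss counters are reset on each report, and a scheduled task evicts and logs every 15 minutes. All names here (ExpiringCache, drainStats, startMaintenance, the injectable nanoClock) are hypothetical; the real service may prefer an existing caching library instead.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch of a TTL cache with basic statistics.
final class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtNanos;

        Entry(V value, long loadedAtNanos) {
            this.value = value;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();
    private final long ttlNanos;
    private final LongSupplier nanoClock; // System::nanoTime in production, a fake in tests

    ExpiringCache(Duration ttl, LongSupplier nanoClock) {
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    /** Returns a fresh cached value, or loads and stores it on a miss or after expiry. */
    V get(K key, Function<K, V> loader) {
        long now = nanoClock.getAsLong();
        Entry<V> e = entries.get(key);
        if (e != null && now - e.loadedAtNanos < ttlNanos) {
            hits.increment();
            return e.value;
        }
        misses.increment();
        // Two threads may both load the same key here; the values are identical, so last write wins.
        V value = loader.apply(key);
        entries.put(key, new Entry<>(value, now));
        return value;
    }

    /** Physically removes entries older than the TTL. */
    void evictExpired() {
        long now = nanoClock.getAsLong();
        entries.values().removeIf(e -> now - e.loadedAtNanos >= ttlNanos);
    }

    /** Reports item count plus hits and misses since the previous report, then resets the counters. */
    String drainStats() {
        return "items=" + entries.size()
                + " hits=" + hits.sumThenReset()
                + " misses=" + misses.sumThenReset();
    }

    /** Evicts and logs every 15 minutes on the given scheduler. */
    void startMaintenance(ScheduledExecutorService scheduler, Consumer<String> log) {
        scheduler.scheduleAtFixedRate(() -> {
            evictExpired();
            log.accept(drainStats());
        }, 15, 15, TimeUnit.MINUTES);
    }
}
```

With this layout an expired entry is never served, because the TTL is checked on every read, but it may stay in memory for up to 15 more minutes until the next eviction pass. If criterion 2 is meant as strict removal at the one-hour mark, the eviction interval would need to be shorter.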

Appendix

https://medium.com/google-cloud/quality-of-service-class-qos-in-kubernetes-bb76a89eb2c6

https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/