We recently announced Addepar Trading, our new trading and rebalancing platform designed to help wealth advisors manage their trading operations with ease and efficiency. We designed the platform to support trading and rebalancing activities with no size or account limits, enabling our largest clients to trade seamlessly without worrying about capacity constraints. 

In this post, we will explore some of the scaling capabilities we’ve built into Addepar Trading, and the AWS services we use to meet these scalability goals.

Aspects of scale

Addepar Trading supports many workflows that facilitate portfolio construction and trade execution within a multi-tenant system. These workflows demand different scaling characteristics, such as:

  • The platform recommends trades to apply to a set of portfolios to achieve desired asset allocations. This process is inherently parallelizable, as rebalancing is applied per account (or household). A portfolio manager may also request multiple rebalances across many thousands of portfolios simultaneously. The platform is multi-tenant, and therefore supports multiple clients running multiple rebalances in parallel.

  • Model construction and configuration of compliance requirements happen less frequently and are less compute-intensive, but require isolation from rebalancing and other compute-heavy workloads.

  • Cross-tenant isolation is also important to ensure workloads submitted by one client have minimal performance impact on workloads executing for other clients. 

  • Order management requires strong consistency, isolation and resilience, coupled with high throughput but low compute requirements. It’s important that order management is isolated from other activity on the system that would otherwise consume resources and negatively impact the performance of the order management workflows.

Against this backdrop, we will discuss how we use AWS services to meet some of these requirements.

A deeper look at rebalancing

What is rebalancing?

Rebalancing a portfolio is the process of changing the weightings of assets in an investment portfolio by buying or selling assets. Rebalancing may occur in response to several factors, for example:

  • A change in the investment strategy 

  • Cash flowing into the portfolio that needs to be invested

  • Cash distributions requested from the portfolio

  • Changes in the value of portfolio assets over time, which require adjustments in holdings to realign the portfolio with its desired asset allocations

Addepar Trading automates the activity of rebalancing by recommending the trades needed to rebalance a portfolio in each of these situations. The platform’s trade recommendations can be tailored to the firm's guidelines and clients' needs by incorporating compliance rules, client mandates and tax preferences into its calculations.

The rebalancing process

Rebalancing is triggered by a portfolio manager asking the system to run a rebalance. The manager may ask for many portfolios to be rebalanced simultaneously.

For example, a manager might request a rebalance to model for 57 portfolios in a single run.

Our largest clients manage thousands of portfolios and the platform must support multiple managers requesting multiple rebalances at any time for any number of portfolios they have under management. In addition, the platform is multi-tenant, so we must consider scaling to many thousands of rebalance runs on a daily basis.

To meet these demands, we implemented the rebalance process asynchronously and used several AWS services to reach our performance goals.

  1. When the user requests a rebalance run for a set of portfolios, the rebalancing service retrieves all of the data required for the rebalance in bulk from Aurora: the portfolios, their holdings, the target allocation models for the portfolios, the market data for the securities held in the portfolios and any restrictions and equivalents applied to the run. Retrieving the data in bulk is efficient because much of the market data and many of the models are shared between portfolios, so shared data does not need to be fetched once per portfolio.

  2. The rebalancing service creates a single data package for each portfolio that contains the portfolio's holdings, the market data for each holding, and the model and compliance data for that portfolio, and writes these packages to Amazon S3 (S3).

  3. The rebalancing service publishes a rebalancing message for each portfolio in the run to Kafka (Amazon MSK), and marks the rebalance run as in progress.

  4. The rebalancing engines consume the messages from Kafka and execute the rebalance for each portfolio independently. Each engine reads its rebalance package from S3 rather than from the database. S3's scale and durability allow us to scale the engines horizontally while minimizing read activity on the database.

  5. The engines write the rebalance output, the recommended trades for the portfolio, to S3. Because both the inputs and the outputs are stored as objects, this architecture also creates a robust and static audit trail of each rebalance.

  6. The rebalancing service flags the run as complete when every portfolio in the run has completed.
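The steps above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: a dict stands in for the S3 bucket, a list stands in for the Kafka topic, and the function names, key layout and message fields are all hypothetical.

```python
import json
from collections import defaultdict

# In-memory stand-ins (assumptions): a dict plays the S3 bucket and a list
# plays the Kafka topic, so the flow can be followed without any AWS access.
object_store = {}
topic = []
run_status = {}
pending = defaultdict(set)


def start_run(run_id, portfolios, market_data, models):
    """Steps 1-3: build one package per portfolio, store it, publish a message."""
    for p in portfolios:
        package = {
            "portfolio_id": p["id"],
            "holdings": p["holdings"],
            # Only the prices this portfolio needs go into its package.
            "market_data": {s: market_data[s] for s in p["holdings"]},
            "model": models[p["model_id"]],
        }
        key = f"runs/{run_id}/input/{p['id']}.json"
        object_store[key] = json.dumps(package)
        topic.append({"run_id": run_id, "portfolio_id": p["id"], "input_key": key})
        pending[run_id].add(p["id"])
    run_status[run_id] = "IN_PROGRESS"


def engine_handle(message, rebalance):
    """Steps 4-5: read the package, run the rebalance, write the output."""
    package = json.loads(object_store[message["input_key"]])
    trades = rebalance(package)
    out_key = f"runs/{message['run_id']}/output/{message['portfolio_id']}.json"
    object_store[out_key] = json.dumps(trades)
    return out_key


def mark_complete(run_id, portfolio_id):
    """Step 6: the run is complete once every portfolio has reported back."""
    pending[run_id].discard(portfolio_id)
    if not pending[run_id]:
        run_status[run_id] = "COMPLETE"
```

Because each message carries only a pointer to its package, any engine can pick up any message, which is what makes the work easy to spread across many engines.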

Using S3 in this manner means we can capture a full audit for each rebalance run which includes the model and compliance inputs, the portfolio holdings, market data and the recommended trades. We use S3’s object lock feature to ensure this data is immutable to provide a robust audit trail even as the portfolio data in the database changes over time. S3 provides fully scalable access to these large datasets at a low cost. We also use S3’s lifecycle configuration capabilities to manage archiving and eventual policy-based deletion of this data without having to write custom code or run scheduled jobs.
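To make the Object Lock and lifecycle settings more concrete, here is a hedged sketch of what the request arguments might look like. The retention mode, day counts, storage class, rule ID and prefix are illustrative assumptions, not our actual policy. The dictionaries follow the shapes accepted by boto3's `put_object` and `put_bucket_lifecycle_configuration` calls, but the sketch only builds them and does not call AWS.

```python
from datetime import datetime, timedelta, timezone


def locked_put_args(bucket, key, body, retain_days):
    """Build put_object arguments that apply a retention period to one object."""
    retain_until = datetime.now(timezone.utc) + timedelta(days=retain_days)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": "COMPLIANCE",  # assumed mode; GOVERNANCE also exists
        "ObjectLockRetainUntilDate": retain_until,
    }


def lifecycle_rules(prefix, archive_after_days, expire_after_days):
    """Build a lifecycle configuration that archives, then expires, audit data."""
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-rebalance-audit",  # hypothetical name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }
```

With boto3, these would be passed as `s3.put_object(**locked_put_args(...))` and `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_rules(...))`. Object Lock has to be enabled on the bucket, which in turn requires versioning.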

More efficient testing

Another benefit of the stateless engine design is that we can test the rebalancing algorithms independently from the other platform services. The engine input is a set of JSON documents stored in S3, and the output is also a set of JSON documents written to S3. This enables us to create a rich set of test scenarios that we can inject into the engine and verify the output against expected results. These tests run as unit tests in the engine build process, but allow us to test the full functionality of the rebalancing engines without needing to run additional services or infrastructure.
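A test in this style might look like the sketch below. The `rebalance` function is a deliberately simple toy that trades each model security toward its target weight; it is not our rebalancing algorithm, and the JSON field names are assumptions made for the example. The point is the shape of the test: a JSON document goes in, a JSON document comes out, and nothing else needs to be running.

```python
import json


def rebalance(package_json):
    """Toy engine: trade each model security to its target weight."""
    pkg = json.loads(package_json)
    prices = pkg["market_data"]
    holdings = pkg["holdings"]
    total = pkg.get("cash", 0.0) + sum(q * prices[s] for s, q in holdings.items())
    trades = []
    for symbol, weight in sorted(pkg["model"].items()):
        # Whole shares only: round the target quantity down.
        target_qty = int(weight * total // prices[symbol])
        delta = target_qty - holdings.get(symbol, 0)
        if delta:
            side = "BUY" if delta > 0 else "SELL"
            trades.append({"symbol": symbol, "side": side, "quantity": abs(delta)})
    return json.dumps({"portfolio_id": pkg["portfolio_id"], "trades": trades})


def test_rebalance_moves_portfolio_to_model():
    scenario = json.dumps({
        "portfolio_id": "p1",
        "holdings": {"AAA": 100},
        "market_data": {"AAA": 10.0, "BBB": 20.0},
        "model": {"AAA": 0.5, "BBB": 0.5},
    })
    expected = {
        "portfolio_id": "p1",
        "trades": [
            {"symbol": "AAA", "side": "SELL", "quantity": 50},
            {"symbol": "BBB", "side": "BUY", "quantity": 25},
        ],
    }
    assert json.loads(rebalance(scenario)) == expected
```

Each new scenario is just another pair of input and expected-output documents, so the scenario library can grow without changes to the test harness.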

Dynamic compute workloads

We use a service-oriented architecture that enables us to scale different parts of the platform independently. In the previous section we discussed how the rebalancing engines apply trading algorithms to determine which trades to recommend for a portfolio. We designed this process to be asynchronous, and our algorithms are quick to execute (on the order of 500 ms per portfolio), but the platform may need to rebalance many thousands of portfolios simultaneously in response to client requests.

We implemented the rebalancing engines as stateless processes that we can scale using Amazon Elastic Kubernetes Service (EKS). Each request to rebalance a single portfolio is sent through Amazon Managed Streaming for Apache Kafka (MSK) and consumed by one of the engines. EKS dynamically scales the number of running engines in response to the demand for rebalancing, and this includes scaling the engines down when load on the platform decreases.

We trigger scaling events based on the depth of the Kafka backlog, that is, the number of rebalancing requests waiting to be consumed. When requests back up, the platform spins up more engines to work through them.
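The relationship between backlog depth, engine count and time to finish can be sketched as follows. This is an illustrative calculation rather than our autoscaler configuration: the per-engine backlog target and the minimum and maximum engine counts are assumed values, and the only figure taken from above is the roughly 500 ms per portfolio.

```python
import math


def desired_engines(backlog, per_engine_target, min_engines, max_engines):
    """Engines needed so each one faces at most `per_engine_target` requests."""
    wanted = math.ceil(backlog / per_engine_target)
    return max(min_engines, min(max_engines, wanted))


def estimated_drain_seconds(backlog, engines, seconds_per_portfolio=0.5):
    """Rough time to clear the backlog if engines share the work evenly."""
    return math.ceil(backlog / engines) * seconds_per_portfolio


# Example: 10,000 queued portfolios, aiming for at most 200 per engine.
engines = desired_engines(10_000, per_engine_target=200, min_engines=2, max_engines=100)
print(engines)                                   # 50
print(estimated_drain_seconds(10_000, engines))  # 100.0
```

One practical limit to keep in mind with Kafka is that, within a consumer group, each partition is read by at most one consumer, so the topic's partition count caps how many engines can usefully run at once.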

Cost efficiency

This EKS-based implementation allows us to isolate the compute-intensive rebalancing algorithms from less resource-hungry workloads such as model construction and compliance policy configuration. These workloads tend to execute less frequently and require fewer compute resources, so we use EKS to provision these services with fewer pods and with less CPU and memory per pod. This lets us run these services at a lower cost while still meeting our performance SLAs.

Scaling the database

We built the Trading platform atop three Postgres databases (one each for Rebalancing, Order Management and Investment Book of Record) because each of these domains requires strong consistency guarantees. We use Amazon Aurora to scale each of these databases both vertically and horizontally, taking advantage of Aurora’s read replicas to optimize read workloads and automated multi-region replication to underpin our recovery strategy. Amazon RDS also supports upgrading compute and memory resources with little downtime as our platform grows.
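One common way to take advantage of read replicas is to send read-only work to the cluster's reader endpoint and everything else to the writer endpoint. The sketch below illustrates that routing decision only. The class, function and hostnames are hypothetical, and the `needs_fresh_data` flag reflects an assumption that some reads cannot tolerate even a small replica lag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterEndpoints:
    writer: str  # cluster endpoint, which points at the primary instance
    reader: str  # reader endpoint, which spreads connections across replicas


def endpoint_for(endpoints, read_only, needs_fresh_data=False):
    """Pick an endpoint: replicas for tolerant reads, the writer otherwise."""
    if read_only and not needs_fresh_data:
        return endpoints.reader
    return endpoints.writer


# Placeholder hostnames for illustration only.
rebalancing_db = ClusterEndpoints(
    writer="rebalancing-writer.example.internal",
    reader="rebalancing-reader.example.internal",
)
print(endpoint_for(rebalancing_db, read_only=True))   # rebalancing-reader.example.internal
print(endpoint_for(rebalancing_db, read_only=False))  # rebalancing-writer.example.internal
```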

Tenant isolation

As well as scaling our services independently, we used Amazon Aurora to implement the bridge partitioning model, in which each tenant's data lives in its own schema within a shared database. In addition to the security improvements this offers over pool partitioning, where tenants share the same tables, this improves performance for our larger clients by reducing the amount of data held in each schema and reducing the complexity of our database operations.
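As a rough illustration of schema-per-tenant access in Postgres, the sketch below scopes a transaction to one tenant's schema using `SET LOCAL search_path`. The `tenant_` naming convention, the tenant ID format and the helper names are assumptions for the example, and the statements are only built here, not executed.

```python
import re

# Assumed tenant ID format: short, lowercase, safe to embed in an identifier.
_TENANT_ID = re.compile(r"[a-z0-9_]{1,40}")


def schema_for(tenant_id):
    """Map a tenant ID to its schema name, rejecting anything unexpected."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


def scoped_statements(tenant_id, query):
    """Wrap a query so unqualified table names resolve to the tenant's schema."""
    schema = schema_for(tenant_id)
    return [
        "BEGIN",
        # SET LOCAL lasts only until the end of the current transaction.
        f'SET LOCAL search_path TO "{schema}"',
        query,
        "COMMIT",
    ]
```

Validating the tenant ID before building the schema name matters because identifiers cannot be passed as ordinary bind parameters. Scoping the search path to the transaction means a pooled connection does not carry one tenant's schema over to the next request.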

Summary

We’ve discussed how we leverage some of AWS’s services to help meet our performance targets. However, we’re just at the beginning of our trading journey and we anticipate our clients growing their business and demanding more from our trading platform. We are ensuring that we can stay ahead of the demand curve with both responsive services and sophisticated load balancing to continue to provide a premium level of service to all clients, both big and small.