Performance Analysis Concept
Measure performance and take appropriate actions. We ran tests ourselves and share our setup and results.
Introduction
When thinking about performance analysis, there are certain things you need to consider and prepare upfront.
Define Your Expectations
Your expectations should be based on a characteristic load profile. That includes how many concurrent users you expect to use your application, as well as how many tenants you expect the load to be distributed across. Once you know how many users to expect, also run a baseline test with a single user to find the mark that defines the absolute optimum: the best possible result any test can achieve under the expected load. It can't get any better than that.
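Such an expectation can be turned into a concrete throughput target with simple arithmetic. A sketch with purely illustrative numbers (these are assumptions for the example, not figures from our tests):

```javascript
// Back-of-the-envelope load profile — all numbers are assumptions:
const concurrentUsers = 2000; // expected peak concurrency
const thinkTimeSeconds = 5;   // average pause between two requests of one user
const tenants = 4;            // tenants the load is distributed across

const targetRps = concurrentUsers / thinkTimeSeconds;
const rpsPerTenant = targetRps / tenants;

console.log(`target throughput: ${targetRps} req/s (${rpsPerTenant} req/s per tenant)`);
// → target throughput: 400 req/s (100 req/s per tenant)
```

Comparing such a target against the single-user baseline tells you early whether your KPI is realistic at all.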
Define Available Resources
Can you tell how many resources are available for your application? This has to be defined so that you can evaluate, by doing performance analysis, whether the resources are sufficient for normal load, whether you need to optimize things, or even whether the resources are sufficient for an overload situation. So, if not already done, set the stage with regard to cluster size, memory, and CPU speed.
Know Your Goal (KPI)
Is your goal to maximize requests per second, or to keep response times under one second? Answering this question, together with your expectations and resources, gives you the guidance for your tests. With all this information at hand, you're prepared to start with this guide and analyze the performance of your application.
Initial Measurement
You can find similar graphics throughout the guide. The active as well as the isolated components are highlighted to emphasize the content of the respective section. All inactive components are displayed in light gray.
The first step of your tests is the biggest one, the full monty. Deploy your application in a productive setup into your target landscape/environment. Why? Because this test has the potential to show that your KPI is already fulfilled without the need for further optimizations.
For our tests, we used a data model consisting of one entity with 36 fields.
```cds
entity IACT0002 : managed {
  key FLOWUUID : UUID @odata.Type : 'Edm.String';
  VAN_RECV_DATE : Date;
  USER_ID : String(20);
  USER_NAME : String(20);
  USER_ORGEH : String(14);
  USER_ORGTX : String(40);
  SLACK : String(20);
  CARDNO : String(16);
  APPR_NUM : String(10);
  ACQ_CLASS : String(1);
  COLL_NO : String(50);
  USEDAT : Date;
  USETIME : Time;
  BUS_NUM : String(20);
  P_USAGE : String(60);
  P_PHONE : String(20);
  P_ADDRESS : String(100);
  TAXTYPE : String(20);
  APPR_AMT : Decimal(15, 2);
  AMOUNT : Decimal(15, 2);
  WMWST : Decimal(15, 2);
  TIP_AMT : Decimal(15, 2);
  ENU_AMT : Decimal(15, 2);
  ACQU_FEE : Decimal(15, 2);
  WAERS : Currency;
  INDU_CODE : String(10);
  INDU_CLAS : String(50);
  ABROAD : String(1);
  STATUS : String(1);
  N_FLOWUUID : UUID @odata.Type : 'Edm.String';
  FLOWCODE : String(5);
  FLOWNO : String(10);
  FLOWCNT : String(3);
  FLOW_START : Boolean;
  LIFNR : String(10);
  MESSAGE : String(100);
}
```

Deploy Your Application
Before deploying your application, make sure that the resources you want to use are provisioned in your target landscape/environment. For SAP BTP, provisioning is mostly done using the cockpit, although there's also the SAP BTP CLI, which you can use to administer your account, orgs, and spaces in the Cloud Foundry environment.
To deploy, you can use an MTA deployment, or deploy your applications manually using `cf push`, which means you have to orchestrate and bind all services to your applications yourself.
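If you go for plain `cf push`, the scaling parameters and service bindings can be declared in an application manifest. A hypothetical `manifest.yml` matching the setup used later in this guide (8 instances of 1 GB each); all names are placeholders, not from our actual setup:

```yaml
applications:
  - name: perf-test-srv       # placeholder application name
    path: gen/srv
    memory: 1G
    instances: 8
    buildpacks:
      - nodejs_buildpack
    services:                 # bound here instead of via MTA orchestration
      - perf-test-hana        # SAP HANA service instance (placeholder)
      - perf-test-xsuaa       # XSUAA instance for token validation (placeholder)
```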
Run Load Tests
We use Autocannon, an HTTP/1.1 benchmarking tool written in Node.js, as the load generator. It keeps a configurable number of connections busy for a given duration and reports throughput (requests per second) as well as latency percentiles. A typical invocation is `npx autocannon -c 100 -d 30 -w 8 <url>`, which runs 100 concurrent connections for 30 seconds across 8 worker threads. Make sure the load generator itself doesn't become the bottleneck: run it with enough workers and close to the target, so that you measure the application and not the client or the network in between.
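Conceptually, a load generator like Autocannon keeps a fixed number of requests in flight and counts completions. A minimal, stdlib-only Node.js sketch of that idea, with illustrative numbers and a throwaway local target so it's self-contained:

```javascript
import http from 'node:http';

// Throwaway local target, standing in for the deployed service.
const server = http.createServer((req, res) => res.end('ok'));
await new Promise((resolve) => server.listen(0, resolve));
const url = `http://127.0.0.1:${server.address().port}/`;

// Issue one GET and resolve once the response is fully consumed.
function fire() {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      res.resume(); // drain the body
      res.on('end', resolve);
    }).on('error', reject);
  });
}

// Keep `connections` requests in flight until `total` requests have completed.
async function loadTest(connections, total) {
  let started = 0;
  let completed = 0;
  async function worker() {
    while (started < total) {
      started++; // claim a slot before awaiting
      await fire();
      completed++;
    }
  }
  await Promise.all(Array.from({ length: connections }, () => worker()));
  return completed;
}

const begin = Date.now();
const completed = await loadTest(10, 200);
const seconds = (Date.now() - begin) / 1000 || 1;
console.log(`${completed} requests, ~${Math.round(completed / seconds)} req/s`);
server.close();
http.globalAgent.destroy(); // close idle keep-alive sockets so the process can exit
```

A real tool adds warm-up phases, latency histograms, and pipelining on top of this loop; the sketch only shows the core mechanism.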
Isolate Components
If your KPI isn't met during the full deployment, it's time to investigate which component is consuming resources or limiting throughput. Do this by isolating components one by one. Always isolate only one component at a time; otherwise you can't identify the component you need to optimize. In exceptional cases, it can make sense to make bigger cuts through your setup to group components into clusters and then, within these clusters, isolate the components one at a time.
To start from the core and find out whether there's already an impact at the smallest possible unit, let's begin with the application itself.
Baseline Test
In each of the following test setups, you first run a baseline test before you run load tests. Simulate one user sending one request, and take the result as the baseline for all other load tests.
CAP Service Layers
Let's start with the CAP application and compare it to the fastest competitor we're aware of. In this step, you eliminate any other potential bottleneck and compare a CAP-based REST service with its minimalistic pure Express.js equivalent. This test runs locally, to rule out the network and the components of SAP BTP as potential influencing factors. Also, we don't use a database but serve the data from memory.
- GET express/ad_hoc (= without CAP) vs GET rest/ad_hoc (= with CAP)
- Use 8 workers, run locally, no database, no network, no cloud platform.
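The service under test can be thought of as follows. This is an assumed sketch, not the original test code: a plain Node.js HTTP handler that serves generated rows from memory and honors a `$top` query option, which is essentially all the REST and Express.js variants need to do in this setup.

```javascript
import http from 'node:http';

// 1000 generated rows, standing in for the IACT0002 test data.
const rows = Array.from({ length: 1000 }, (_, i) => ({
  FLOWUUID: String(i),
  USER_ID: `user${i}`,
  AMOUNT: i * 1.5,
}));

const server = http.createServer((req, res) => {
  const { searchParams } = new URL(req.url, 'http://localhost');
  const top = Number(searchParams.get('$top') ?? rows.length);
  res.setHeader('content-type', 'application/json');
  res.end(JSON.stringify(rows.slice(0, top)));
});
await new Promise((resolve) => server.listen(0, resolve));

// One sample request, where the load test would send thousands of them.
const data = await new Promise((resolve, reject) => {
  const port = server.address().port;
  http.get(`http://127.0.0.1:${port}/ad_hoc?$top=10`, { agent: false }, (res) => {
    let buf = '';
    res.on('data', (chunk) => (buf += chunk));
    res.on('end', () => resolve(JSON.parse(buf)));
  }).on('error', reject);
});
console.log(data.length); // → 10
server.close();
```

Because serialization grows with the number of rows, even this minimal handler shows why `$top 100` responds more slowly than `$top 1`.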
| Requests | CAP | Express.js |
|---|---|---|
| $top 1 | 50000 | 50000 |
| $top 10 | 50000 | 50000 |
| $top 100 | 50000 | 50000 |
Tip
In our test, we had two key findings:
- CAP performs largely on a par with plain Express.js.
- The number of rows has a major impact on response times (also with plain Express.js).
As we found that in this setup there's no measurable difference, let's see if changing the protocol has an impact, keeping the rest of the setup stable.
OData Protocol / Library
In this step, the setup is stable with regard to the applications: we used the CAP application and the Express.js application with the REST adapter. We're also still running locally, eliminating the same dependencies as in the previous test. Now, we compare those results to a CAP application using the OData adapter, more precisely the Node.js OData library used with CAP.
- GET express/ad_hoc vs GET rest/ad_hoc vs GET odata/ad_hoc
- Use 8 workers, run locally, no database, no network, no cloud platform.
| Requests | OData | CAP | Express.js |
|---|---|---|---|
| $top 1 | 9000 | 50000 | 50000 |
| $top 10 | 6000 | 50000 | 50000 |
| $top 100 | 1500 | 50000 | 50000 |
Warning
The key finding was:
The Node.js OData library provides more functionality than CAP needs. This comes with some performance penalties.
Danger!
TODO @Daniel/Sebastian: Add a sentence on what we recommend. In our sync, you said we're working on sth that just uses the OData functionality we need. Can we already tell people that?
Given that the REST protocol is used, there's no difference between a CAP application and a plain Express.js application. As the OData library would skew our further measurements, we go on testing with a CAP application using the REST adapter. The next step touches the target landscape of the application. In our case, and most probably in yours as well, this target landscape is the SAP BTP Cloud Foundry environment.
SAP BTP
You can run the same load tests as above with the CAP application deployed on SAP BTP. We used 8 instances, 1 GB of memory each, without any service bound to the application.
To eliminate network latency as an effect, we deployed the client on Cloud Foundry as well.
| Requests | CAP | Express.js |
|---|---|---|
| $top 1 | 13000 | 19000 |
| $top 10 | 10000 | 15000 |
| $top 100 | 4000 | 5000 |
Tip
In our tests, our key findings were:
- CAP performs in the same range as Express.js.
- The impact of the number of rows served becomes much more significant.
- SAP BTP (that is, routers, ...) limits the maximum throughput.
Using SAP BTP Services
This test is the first time we add a database (SAP HANA) to our setup. We expect to see an impact. The question, thinking in terms of performance budgeting, is whether this impact is at the expected level or beyond it. We also add authentication, where we don't expect to see much of an impact.
To eliminate network latency as an effect, we deployed the client on Cloud Foundry as well.
Danger!
TODO @Sebastian: The description regarding the JWT token needs your refinement.
Run the load tests with authentication switched on and off, to isolate the impact of token validation.
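Why does token validation cost throughput? Every request carries a token whose signature must be checked. The following is an illustration only, not the XSUAA implementation: it uses a simplified HMAC (HS256-style) token with an assumed secret, whereas XSUAA actually issues asymmetrically signed (RS256) JWTs. The per-request cost argument, and the common mitigation of caching validated tokens, is the same.

```javascript
import crypto from 'node:crypto';

const SECRET = 'demo-secret'; // assumption, for illustration only

function sign(payload) {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  const sig = crypto.createHmac('sha256', SECRET).update(body).digest('base64url');
  return `${body}.${sig}`;
}

const cache = new Map(); // raw token -> validated payload
function verify(token) {
  if (cache.has(token)) return cache.get(token); // cache hit: no crypto work
  const [body, sig] = token.split('.');
  const expected = crypto.createHmac('sha256', SECRET).update(body).digest('base64url');
  if (!crypto.timingSafeEqual(Buffer.from(sig), Buffer.from(expected))) {
    throw new Error('invalid signature');
  }
  const payload = JSON.parse(Buffer.from(body, 'base64url').toString());
  cache.set(token, payload);
  return payload;
}

const token = sign({ user: 'alice' });
console.log(verify(token).user); // → alice (full signature check)
console.log(verify(token).user); // → alice (served from the cache)
```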
| Requests | REST w/o token | REST w/ token | Factor |
|---|---|---|---|
| /ad_hoc?$top 1 | 20100 | 13500 | ~x1.5 |
| /ad_hoc?$top 10 | 16200 | 10200 | ~x1.6 |
| /ad_hoc?$top 100 | 4800 | 4100 | ~x1.2 |
| /from_db?$top 1 | 4350 | 4150 | ~x1.1 |
| /from_db?$top 10 | 4950 | 4300 | ~x1.2 |
| /from_db?$top 100 | 1650 | 1550 | ~x1.1 |
Warning
Key Findings: Token validation has a significant impact in terms of performance budgeting.
Danger!
TODO What would be our recommendation? Is there any?
Now, add the SAP HANA database and compare the numbers to those obtained for the CAP-based REST service without a database.
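In a CAP Node.js project, the database is switched via the `cds.requires.db` configuration, for example in `package.json`. A sketch; adjust to your own project setup:

```json
{
  "cds": {
    "requires": {
      "db": { "kind": "hana" }
    }
  }
}
```

For the earlier database-less runs, CAP can instead serve from an in-memory SQLite database, which keeps the application code identical across both test setups.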
Danger!
TODO Was this test with or without XSUAA? From the results I'd guess, w/o XSUAA. Then I need to correct the graphic.
| Requests | CAP w/o SAP HANA | CAP w/ SAP HANA |
|---|---|---|
| $top 1 | 13000 | 5000 |
| $top 10 | 10000 | 4000 |
| $top 100 | 4000 | 1400 |
Warning
Key findings: SAP HANA (Cloud Edition) limits max throughput.
Danger!
TODO Would we want to recommend PG? Other than that, there's no alternative, right?
Using Locale-Specific Sorting
In the previous test, we saw an impact of SAP HANA that exceeds the usual expectations for a database. Granted, a database always comes at a cost in terms of performance budgeting, but here we felt the urge to look into optimization. One trigger was to test the impact of using specific parameters with SAP HANA, and we found that, for example, locale-specific sorting has a huge impact. Locale-specific sorting is enabled by default when SAP HANA is used in combination with an authentication library. For those cases, we recommend testing whether your customers are happy with less functionality in exchange for better performance.
So, here's what we tested: the CAP application deployed on SAP BTP, with 8 instances of 1 GB memory each, using token validation and SAP HANA.
Now, switch off locale-specific sorting by not using SAP HANA's `WITH PARAMETERS` clause.
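To see what locale-specific sorting actually changes, here's the same trade-off in plain JavaScript. This illustrates the concept only; it is not what SAP HANA executes internally:

```javascript
const names = ['Zimmer', 'Äpfel', 'Anton'];

// Plain sort compares UTF-16 code units: 'Ä' (U+00C4) sorts after 'Z'.
console.log([...names].sort());
// → [ 'Anton', 'Zimmer', 'Äpfel' ]

// Locale-specific sort applies the (costlier) German collation rules.
const collator = new Intl.Collator('de');
console.log([...names].sort(collator.compare));
// → [ 'Anton', 'Äpfel', 'Zimmer' ]
```

The locale-aware comparison does more work per pair of values, which is cheap for three names but adds up when the database sorts large result sets on every request.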
| Requests | CAP w/o locale sorting | CAP w/ locale sorting |
|---|---|---|
| $top 1 | 4400 | 320 |
| $top 10 | 4200 | 250 |
| $top 100 | 1500 | 260 |
Warning
Our key findings are:
- Locale-specific sorting has a dramatic effect of up to a factor of 16.
- Don't use locale-specific sorting when throughput is important!
Scale-Out
We analyzed the impact of scale-out for increasing throughput in peak scenarios by increasing the number of instances in the cloud (both setups using CAP REST requests with SAP HANA).
| Requests | CAP x8 | CAP x16 |
|---|---|---|
| $top 1 | 4300 | 8750 |
| $top 10 | 3400 | 6750 |
| $top 100 | 1440 | 2880 |
Tip
Our key finding: Scale-out works as expected. Doubling the number of instances improves throughput by a factor of 2. Scale-out is and remains the major and recommended approach to address peak scenarios.
Clustering
Finally, we analyzed the impact of Node.js clustering, that is, spawning several workers within a 1-GB instance on Cloud Foundry (all with CAP REST requests with SAP HANA).
| Requests | CAP 1 worker | CAP 2 workers |
|---|---|---|
| $top 1 | 4300 | 5600 |
| $top 10 | 3400 | 4900 |
| $top 100 | 1440 | 2250 |
Warning
Our key findings are:
- Two workers show only a moderate effect of ~ 20% improvement.
- Out-of-memory errors occurred frequently with more than 2 workers (single tenant).
Clustering therefore doesn't seem to be a reasonable optimization option.
Actions to Optimize
Given the previous analysis and findings, we recommend the following optimization measures. With these measures, you're able to reach ~ 4,000 requests per second with 8 instances of 1 GB each, assuming that the actually observed mix of requests averages in the range of $top 10 requests:
- Switch from OData to REST for the most critical requests.
- Switch off locale-specific sorting.
- Leverage scale-out.
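In CAP, the protocol can be switched per service definition, so the most critical requests can be served via REST while everything else stays on OData. A sketch with hypothetical service names:

```cds
// OData remains the default for the regular service ...
service AnalyticsService {
  entity Flows as projection on IACT0002;
}

// ... while a hot-path service is exposed via plain REST.
@protocol: 'rest'
service FastService {
  entity Flows as projection on IACT0002;
}
```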