Welcome to our blog.


Apples and Oranges: Benchmarking Scailable

Contributed February 3rd, 2020.

In one of Susan Li's "Towards Data Science" blog posts (here), she shows how to train classifiers on a fruit classification task using Python's Scikit-Learn. In a new demo, we demonstrate how Scailable allows you to put Susan's models into production with just one additional line of code:


Intrigued? Continue here to find out how to put your fitted models into production with just one line of code!

Going for a ride: Scailable for automatic property valuation

Contributed December 19th, 2019.

We are happy to share a demo showcasing one of our public REST endpoints. You can find the interactive demo here. The demo front-end consumes the following Scailable REST endpoint:

This REST endpoint generates inferences (predictions) of ask-prices for houses. You submit a JSON object describing the property you wish to valuate in the body of a POST request, and you receive back a JSON object containing 200 predicted prices, from which you can compute, e.g., the expected value or a reasonable credible interval.
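For illustration, here is how a client might summarize the 200 returned price draws into a point estimate and a central credible interval. This is a hedged sketch: the helper function and the fabricated draws below are ours, not part of the demo.

```python
import statistics

def summarize_draws(draws, level=0.95):
    """Turn posterior price draws into a mean and a central
    credible interval covering `level` of the draws."""
    ordered = sorted(draws)
    k = int((1 - level) / 2 * len(ordered))  # draws cut from each tail
    return {
        "mean": statistics.mean(ordered),
        "lower": ordered[k],
        "upper": ordered[len(ordered) - 1 - k],
    }

# In the demo, `draws` would be the 200 predictions returned by the
# endpoint; here we fabricate 200 evenly spaced values for illustration.
draws = [300_000 + 1_000 * i for i in range(200)]
summary = summarize_draws(draws)
```

For these fabricated draws, `summary` holds the mean (399,500) and a 95% credible interval running from 305,000 to 494,000.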

Note: the REST endpoint is only accessible through POST requests, not via a browser GET request.

Below we detail how this demo was built on top of our platform. If you are just interested in the performance gains achieved by putting this machine learning task into production on the Scailable platform, please scroll directly to the performance comparisons.

The dataset

The automatic property valuation model was fitted on a dataset containing the ask-price of 5606 houses listed for sale in the ten biggest cities in the Netherlands on November 15, 2019. The cities are:

  1. Amsterdam (AMS)
  2. Rotterdam (ROT)
  3. The Hague (DHA)
  4. Utrecht (UTR)
  5. Eindhoven (EIN)
  6. Tilburg (TIL)
  7. Almere (ALM)
  8. Groningen (GRO)
  9. Breda (BRE)
  10. Nijmegen (NIJ)

The dataset itself can be downloaded as a CSV here.

The dataset contains the following features:

  • price The asking price.
  • volume The volume in m3.
  • yard The size of the yard in m2.
  • area The living area in m2.
  • rooms The number of rooms.
  • sleeping The number of bedrooms.
  • iso Insulated? 0 = no, 1 = yes.
  • hood Neighborhood name (Note: not used in subsequent analysis).
  • garden Garden present? 0 = no, 1 = yes.
  • balcony Balcony present? 0 = no, 1 = yes.
  • garage Garage present? 0 = no, 1 = yes.
  • age Age of the building in years since 2019.
  • openh Fireplace present? 0 = no, 1 = yes.
  • paint Recently painted? 0 = no, 1 = yes.
  • days Posted in days since 15-11-2019.
  • m- Dummies for the different cities, see above.

The prediction model

For demonstration purposes, we trained a BART model using the following specification (the [R] code below opens the dataset, fits the model, and subsequently saves it):

# Import BART library (provides wbart())
library(BART)

# Load file
df <- read.csv(file = "houses-clean.csv")

# Get input for BART model: price is the outcome, everything else a feature
y <- df$price
X <- df[, -1]

# Ignore neighborhood:
X$hood <- NULL

# Fit BART model
bart_model <- wbart(X,              # x.train
                    y,              # y.train
                    ndpost = 1000)  # number of kept posterior draws

save(bart_model, file = "bart_model.RData")

# Fit thinned BART model (saves space by keeping only 200 tree draws)
bart_model <- wbart(X, y, ndpost = 1000, nkeeptreedraws = 200)

save(bart_model, file = "bart_model_thinned.RData")

Generating predictions using the fitted bart_model above takes a few seconds on a local machine. However, putting this model into production is tricky: it would require running [R] on a server and setting up some way of interfacing with [R] through REST calls.

Moving to production with Scailable

Using the Scailable platform, we were able to put the above model into production in a highly efficient way, saving valuable development time for data scientists and providing a robust, production-ready implementation. The three-step process is simple:

Step 1: Porting to WebAssembly

First, we used one of our toolchains (currently in private beta) which takes the fitted model object (the BART model from the code above) and compiles it to WebAssembly. The resulting WebAssembly executable can be downloaded here. Note that our toolchain ensures proper I/O and memory management for the resulting WebAssembly executable.

(If you are interested in porting your own BART models to WebAssembly using our private beta toolchain, send us a message).

Step 2: Uploading to the Scailable Platform

Next, we uploaded the WebAssembly executable to our platform. This automatically generates a REST endpoint satisfying whatever consumption needs you might have. The open REST endpoint generated here can be found at:

Note that navigating to the endpoint in your browser is uninteresting, to put it mildly; the whole thing comes to life once you start POSTing JSON objects to it. If you don't know what that sentence means, please check out the interactive demo.

Step 3: Scailable consumption

The interactive demo effectively just POSTs a JSON object that looks like this to the REST endpoint:


The endpoint returns an array of inferences (200 in this case), thereby providing the automatic valuation, including its uncertainty.
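The demo's exact request body is not reproduced here. As an illustrative sketch only, using the feature names from the dataset section above (the endpoint's actual schema is an assumption on our part), such a request might be assembled like this:

```python
import json

# Hypothetical request body; field names follow the dataset features
# described earlier, but the real endpoint's schema may differ.
property_features = {
    "volume": 450,   # volume in m3
    "yard": 40,      # yard size in m2
    "area": 120,     # living area in m2
    "rooms": 5,
    "sleeping": 3,
    "iso": 1,        # insulated
    "garden": 1,
    "balcony": 0,
    "garage": 1,
    "age": 25,       # age in years since 2019
    "openh": 0,      # fireplace present
    "paint": 1,      # recently painted
    "days": 10,      # posted, days since 15-11-2019
    "m-NIJ": 1,      # city dummy (here: Nijmegen)
}

payload = json.dumps(property_features)
# A client would then POST `payload` to the Scailable REST endpoint,
# e.g. with requests.post(endpoint_url, data=payload).
```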

Note: One of the cool features of the Scailable platform is that we can run tasks either server-side or client-side: you can explore both options in the demo!

Performance increases

While the demo itself might not be visually spectacular, what is most interesting is what happens "under the hood." Putting the BART model into production for consumption on the web (or on mobile devices and the like) using the Scailable platform saves your data science team valuable development time. Time that can now be used to build new models.

However, Scailable does not just speed up human tasks: our method of deploying AI and ML models using WebAssembly is itself highly efficient. Let's compare a standard [R] implementation to the Scailable implementation:

  1. The fitted bart_model object consumed 39.1 MB of memory. The Scailable WebAssembly task, bart.wasm, consumes only 3.3 MB.
  2. After trying very hard to reduce the size of the runtime for our [R] implementation, the minimal runtime needed to execute the model is about 80 MB. A Scailable WebAssembly task can be run in a runtime of less than 10 MB.
  3. The [R] implementation takes about 4.5 seconds to generate the inferences. The Scailable task, executed via its REST endpoint, completes in less than a second.

So, for this specific case, we reduce the memory footprint of a typical data science task more than tenfold, and we speed up the generation of inferences more than five times. We find these types of performance increases to be quite typical.
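As a quick sanity check on these ratios, using the numbers from the comparison list above (the sub-second WASM latency is taken as an upper bound of 1 second, so the computed speedup is a lower bound):

```python
# Figures quoted in the comparison list above.
model_mb_r, model_mb_wasm = 39.1, 3.3
runtime_mb_r, runtime_mb_wasm = 80, 10
seconds_r, seconds_wasm_max = 4.5, 1.0

model_ratio = model_mb_r / model_mb_wasm        # ~11.8x smaller model
runtime_ratio = runtime_mb_r / runtime_mb_wasm  # 8x smaller runtime
min_speedup = seconds_r / seconds_wasm_max      # at least 4.5x faster
```

The "more than five times" speedup follows as soon as the WASM task completes comfortably under a second, which matches what we observe.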

Wrap up

We hope this blog post offers you a bit more background on our Scailable platform. If you want to know more, do let us know. Next time (likely early 2020) we will provide more details regarding the production platform that manages the execution of Scailable tasks.

Thanks for your interest!

(Oh, and a disclaimer: while the automatic valuation model presented here was trained on real data and its results are somewhat sensible, it is also very, very simple. For a more precise estimate, we would certainly include more features.)

The Scailable platform: Scaling machine learning in production

Contributed November 6th, 2019.

In our previous post, we explained how moving from vanilla BART (using the implementation shared here) to WASM greatly reduced storage needs and increased prediction speeds for generating house-price predictions. We are finding similar size improvements for many prediction tasks. Subsequently putting the optimized Scailable task into production on our platform allows the task to be executed at scale.

Wait, how does this platform work?

So, how does this platform work? How can you be cheap, fast, robust, and secure?

Effectively, we provide a serverless interface to our Scailable tasks. Each task is a fully optimized WASM executable that can be consumed by posting the task's input data to the appropriate REST endpoint. When a task is consumed, our platform (see the overview below) matches compute demand with compute supply in real time.

Overview of the Scailable platform

This means that your task, according to your specifications, is sent to a machine (cloud, edge, or otherwise) that can execute it within your desired parameters. Do you want to be extremely fast? We can carry out the task on our own dedicated compute-nodes, which keep the WASMs in memory and are extremely fast compared to other serverless solutions: in part because we do not need to spin up large processes (such as a full NodeJS or Python process, as is common on AWS), and in part because we do not need to repeat the validation step of running the executable. Because the tasks themselves are highly optimized, our serverless compute-nodes are also cheaper to put into production than many other existing solutions.

And it's not just speed and efficiency.

However, if speed is less of an issue, we can do even more: we can push your computation to the edge. We have thousands of what we call web-nodes available that can execute your task for a fraction of the cost you would incur at other serverless platforms.

And that's not all. Our Scailable core can do more to make sure your compute-task is executed exactly the way you want it. What if your task is analyzing an image stored on a mobile device, or crunching data stored on a cloud machine? Well, if sending the data to the task consumes more network traffic than the other way around, we will (if possible) send the WASM executable to the task-consumer. We use location, task, and input size and match these directly with the available compute supply. And, we scale "out-of-the-box".
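The data-versus-code shipping decision described above can be sketched roughly as follows. This is our own illustrative reduction of the rule to its simplest form; the function name is hypothetical, and the real scheduler also weighs location, task type, and available compute supply.

```python
def choose_transfer(input_bytes: int, wasm_bytes: int) -> str:
    """Hypothetical sketch of the shipping decision: move whichever
    side of the computation is cheaper to send over the network."""
    if wasm_bytes < input_bytes:
        return "ship-executable-to-data"
    return "ship-data-to-executable"

# E.g., a 3.3 MB WASM task analyzing a 50 MB image stored on a device:
# cheaper to send the executable to where the data lives.
decision = choose_transfer(input_bytes=50_000_000, wasm_bytes=3_300_000)
```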

Putting data science and machine learning models and data processing into production can be hard. And, it can be costly (and even worse, the actual costs incurred can be very hard to judge with various serverless compute providers). Our platform solves these issues: efficient large scale deployment of machine learning tasks is made easy and affordable.

More coming up; let us know if you are interested!

If you are interested, please do let us know: send us a message at go [at] scailable [dot] net. We will be following up shortly with some of our client success cases and more details regarding the compute-nodes: we spend a lot of effort making sure that Scailable tasks can run with extremely high throughput and without the need to start up additional containers for each job. Cool stuff. More next time.

How fast predictions scale business processes

Contributed November 1st, 2019.

Well-trained machine-learning models have become key to automating business processes. Nowadays, insurance companies use machine learning to identify fraudulent claims. Hospitals use models trained on imaging data to automate diagnosis. And machine learning outperforms real estate agents in valuing commercial property. In each of these cases, machine learning allows for automation, the scaling up of business processes, and, ultimately, better service at a lower cost.

Bayesian Additive Regression Trees

Different models are fashionable at different times. Though deep neural networks are currently extremely popular, Bayesian Additive Regression Trees (shorthand "BART") models have been gaining more and more traction.

From a technical point of view, BART models provide an extension to simple regression trees in various ways:

  • The sum-of-trees nature of BART allows the model to easily capture higher-order interactions between variables, providing a very flexible response surface that fits many real-world situations without the need for elaborate feature engineering.
  • The Bayesian nature of BART allows the model to quantify the uncertainty of its predictions in a natural way. While with other models, we often have to resort to bootstrapping (or asymptotic methods), BART accurately quantifies uncertainty "out-of-the-box."
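Concretely, the sum-of-trees model underlying BART (in the standard formulation of Chipman, George, and McCulloch) can be written as:

```latex
y = \sum_{j=1}^{m} g(x;\, T_j, M_j) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2)
```

where each $g(x;\, T_j, M_j)$ is a regression tree with structure $T_j$ and leaf parameters $M_j$. The Bayesian posterior over the $(T_j, M_j)$ pairs and $\sigma$ is what yields the out-of-the-box uncertainty quantification described above: each posterior draw produces one predicted price.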

And it is precisely this quantification of uncertainty which makes BART models so appealing: in many business problems, knowing when you are unsure is better than being overly confident - but wrong - in your answer. For more on BART, please see here and here (the latter in part by members of our team).

Efficient property valuation using Scailable-tasks

At Scailable, we have applied BART to various tasks; for example, we used BART to predict house prices in different regions of the Netherlands. Here, our BART model's quantification of uncertainty proved a perfect fit for the automation of real estate bidding.

But while the BART model is functionally awesome and can be applied to a great many prediction problems, evaluating a BART model in production often proves cumbersome. Fitting a BART model (for instance with the OpenBart package found here) is generally no problem. But putting the resulting model into production would then require running [R] on your server, which can be tricky. Adding to the pain, when fitting a model predicting house prices in different cities in the Netherlands, we ended up with a stored model of over 30 MB. And generating predictions took multiple seconds, as [R] needs to interface with C++ code and the large model had to be loaded into memory again and again.

This all changed when we took our fitted BART model and ported it to our Scailable platform. We used one of our toolchains to port the fitted [R] model to WebAssembly (WASM) and subsequently put the Scailable task into production on our platform.

The WASM executable storing the full model specification and its prediction logic ended up being less than 200 KB. Furthermore, generating a prediction by consuming a Scailable REST endpoint took less than a tenth of a second for the full round trip. Optimization on our platform saved time, compute, and ultimately money.

The property valuation Scailable task now allows our clients to generate high-quality house valuations automatically within a fraction of a second, without having to worry about the implementation details.

Predicted house price

The figure shows a prediction for a house in Nijmegen, The Netherlands. The red vertical line indicates the point estimate (i.e., the result that you would get from your deep neural network). This point estimate of 402k Euro is very close to the estimated 395k provided by a professional real estate agent. However, the figure also clearly shows the uncertainty surrounding this estimate; uncertainty that might considerably affect a purchase decision.

In our next post, we will discuss how we make sure that efficient WASM prediction tasks can be executed at large scale in our serverless environment. And if you are looking for ways to bring your machine learning models into production or to (further) optimize your predictions, do contact us at go [at] scailable [dot] net!

Created in 2019 with ♥. For questions, contact us at go [at] scailable [dot] net.