Project status & roadmap
The Asterius project has come a long way and some examples with complex dependencies already work. It's still less mature than GHCJS though; see the next section for details.
In general, it's hard to give an ETA for "production readiness", since improvements are continuous and we haven't collected enough use cases from seed users yet. For more insight into what comes next for this project, we list our quarterly roadmap here.
Besides the goals in each quarter, we also do regular maintenance like
dependency upgrades and bugfixes. We also work on related projects (mainly
inline-js) to ensure they are kept in
sync and also useful to regular Haskell developers.
What works now
- Almost all GHC language features (TH support is partial, cross-splice state persistence doesn't work yet).
JSVal type in Haskell land.
- Cabal support. Use
ahc-cabal to compile libraries and executables. Support for custom
- A linker which performs aggressive dead-code elimination, based on symbol reachability.
- A debugger which checks invalid memory access and outputs memory loads/stores and control flow transfers.
binaryen raw bindings, plus a monadic EDSL to construct WebAssembly code directly in Haskell.
wasm-toolkit: a Haskell library to handle WebAssembly code, which already powers binary code generation.
- Besides WebAssembly MVP and
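To illustrate the linker item above, here is a minimal sketch of dead-code elimination based on symbol reachability. It is not Asterius's actual linker code: the `Symbol` and `DepGraph` types and the function names are hypothetical, and a real linker tracks far more than a plain dependency map. The idea is simply to start from a set of root symbols, follow references transitively, and drop every definition that was never visited.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical symbol representation, for illustration only.
type Symbol = String

-- For each defined symbol, the symbols its body refers to.
type DepGraph = Map.Map Symbol [Symbol]

-- All symbols reachable from the roots, found with a worklist traversal.
reachable :: DepGraph -> [Symbol] -> Set.Set Symbol
reachable graph = go Set.empty
  where
    go seen [] = seen
    go seen (s : rest)
      | s `Set.member` seen = go seen rest
      | otherwise =
          let deps = Map.findWithDefault [] s graph
           in go (Set.insert s seen) (deps ++ rest)

-- Keep only the definitions that are reachable from the roots.
eliminateDeadCode :: DepGraph -> [Symbol] -> DepGraph
eliminateDeadCode graph roots = Map.restrictKeys graph (reachable graph roots)

-- A toy program: "unused" refers to "helper", but nothing refers to "unused".
example :: DepGraph
example =
  Map.fromList
    [ ("main", ["putStrLn", "helper"]),
      ("helper", ["putStrLn"]),
      ("putStrLn", []),
      ("unused", ["helper"])
    ]

main :: IO ()
main = print (Map.keys (eliminateDeadCode example ["main"]))
-- prints ["helper","main","putStrLn"]
```

Note that reachability is computed from the roots only, so a symbol such as `unused` is removed even though it references live code.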
What may stop one from using Asterius right now
BigInt support at the moment.
- Runtime bugs. The generated code comes with a complex hand-written runtime which is still buggy at times. The situation is expected to improve once we're able to work with an IR more high-level than Cmm and shave off the current hand-written garbage collector; see the 2020 Q3 section for more details.
- GHCJS projects aren't supported out of the box. Major incompatibilities include:
- Word sizes differ. Asterius is still 64-bit based at the moment.
- JSFFI syntax and semantics differ. Asterius uses
Promise-based async JSFFI and GHCJS uses callbacks.
- Cabal handles GHCJS and Asterius differently.
- Lack of Nix support.
- Lack of GHCi support.
- TH support is not 100% complete; certain TH APIs which require preserving state
across splices (e.g.
putQ) don't work yet.
- Cabal tests and benchmarks can't be run out of the box.
Setup.hs support is limited. If it has
setup-deps outside GHC boot libs, it won't work.
- Lack of profiling support for generated code.
- Excessive memory usage when linking large programs.
For the past months before this update, I took a break from the Asterius project and worked on a client project instead. There's a saying "less is more", and I believe my absence from this project for a few months has been beneficial in multiple ways:
- I gained a lot more nix-related knowledge.
- Purging my short-term memory of the project and then coming back gave me some insight into the difficulties of onboarding new contributors.
- After all, it was a great mental relief to work on something where I was definitely not the bottleneck of the whole project.
Before I took the break, Asterius was stuck with a very complex & ad-hoc build system, and it was based on ghc-8.8. The most production-ready major version of ghc is ghc-8.10 today. Therefore, the Q3 goals and roadmap have been adjusted accordingly:
- Upgrade Asterius to use ghc-8.10. The upgrade procedure should be principled & documented, so someone else can repeat this when Asterius upgrades to ghc-9.2 in the future.
- Use cabal & nix as the primary build system.
What has been achieved so far:
- There is a new ghc fork dedicated to Asterius at
https://github.com/tweag/ghc-asterius. It's based on the
ghc-8.10 branch, the previous asterius-specific patches have all been ported, and I implemented nix-based logic to generate cabal-buildable ghc api packages to be used by Asterius, replacing the previous ad-hoc python script.
- There is a WIP branch of ghc-8.10 & nix support at https://github.com/tweag/asterius/pull/860. Most build errors in the host compiler have been fixed, and the booting logic will be fixed next.
- A wasi-sdk/wasi-libc fork is also maintained in the tweag namespace. It's
possible to configure our ghc fork with the
wasm32-unknown-wasi triple now, so that's a good start for the future work of properly transitioning Asterius to a wasi32 backend of ghc.
Remaining work of Q3 will be wrapping up #860 and merging it to
Beyond Q3, the overall plan is also guided by the "less is more" principle: to reduce code rather than to add, leveraging upstream logic whenever possible, while still maintaining and even improving end-user experience. Many hacks were needed in the past due to various reasons, and after all the lessons learned along the way, there are many things that should be shaved off:
- The hacks related to 64-bit virtual address space. Reusing the host GHC API, which targets a 64-bit platform, for Asterius was the easiest way to get the MVP working, but given that we now have much better knowledge of how cross-compiling in ghc works, these hacks need to go away.
- Custom object format and linking logic. This was required since Asterius needed to record a lot of Haskell-specific info in the object files: JSFFI imports/exports, static pointer table, etc. However, with runtime support, this custom info can all be replaced by vanilla data sections in the wasm or llvm bitcode object files.
- Following the entry above, most of the existing wasm codegen logic. It looks possible to leverage the llvm codegen, only adding specific patches to support features like JSFFI.
cross-compiled ghc rts for the wasi32 target, component after component. The
modules which work in runtimes beyond browsers/nodejs (that's why we stick to
emscripten in the first place).
In 2020 Q4 we mainly delivered:
- Use standalone stage-1 GHC API packages and support building Asterius using vanilla GHC.
- Remove numerous hacks and simplify the codebase, e.g.:
ahc a proper GHC frontend exe, support
ahc -c on non-Haskell sources
- Use vanilla archives and get rid of custom
- Refactor things incompatible with 32-bit pointer convention, e.g.:
- Proper heap layout for
- Remove higher 32-bit data/function address tags
- Proper heap layout for
In 2021 Q1, the primary goals are:
- Finish transition to 32-bit code generation.
- Improve C/C++ support, including support for
cbits in common packages.
The plan for achieving the above goals:
- Audit the current code generator & runtime and remove everything incompatible with 32-bit pointer convention.
- For the time being, favor simplicity/robustness over performance. Some previous optimizations may need to be reverted temporarily to simplify the codebase and reduce the refactoring overhead.
wasi-sdk as the C toolchain to configure the stage-1 GHC and finish the transition.
A longer term goal beyond Q1 is upstreaming Asterius as a proper wasm backend of
GHC. We need to play well with
wasi-sdk for this to happen, so another thing
we're working on in Q1 is: refactor the linker infrastructure to make it
LLVM-compliant, which means managing non-standard entities (e.g. static
pointers, JSFFI imports/exports) in a standard-compliant way.
In 2020 Q3 we mainly delivered:
- PIC (Position Independent Code) support. We worked on PIC because, in the beginning, we thought it was a prerequisite of C/C++ support. It turned out it's not, but PIC will still be useful in the future when we implement a dynamic linker and GHCi support.
- Initial C/C++ support, using
wasi-sdk to compile C/C++ sources. Right now this doesn't work with Cabal yet, so the C/C++ sources need to be manually added to
asterius/libc to be compiled and linked. We have already replaced quite a few legacy runtime shims with actual C code (e.g.
text), and more will come in the future.
Proper C/C++ support requires Asterius to be a proper
GHC which is configured to use
wasi-sdk as the underlying toolchain. The
immediate benefits are:
- Get rid of various hacks due to word size mismatch in the code emitted by
wasi-sdk. Some packages (e.g.
integer-gmp) are incompatible with these hacks.
- Implement proper Cabal integration and support
cbits in user packages.
- Improve code size and runtime performance, getting rid of the
i32 pointer casting everywhere.
- Get rid of
Thus the goal of 2020 Q4 is finishing the 32-bit cross GHC transition. The steps to achieve this are roughly:
- Detangle the host/wasm GHC API usage. Asterius will shift away from using
ghc of the host GHC and instead use its own stage-1 GHC API packages.
- Fix various issues when configuring GHC to target
wasi-sdk as the toolchain.
- Refactor the code generator and the runtime to work with the new 32-bit pointer convention.
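To make the pointer-convention change more concrete, the sketch below shows one way a 64-bit word can carry a 32-bit wasm address in its lower half and a tag in its upper half, and why such a scheme needs a cast before every memory access. This is an illustration under stated assumptions, not the actual Asterius encoding: the tag values, the `AddrKind` type and all function names are hypothetical.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32, Word64)

-- Hypothetical address kinds; the real runtime may distinguish more.
data AddrKind = DataAddr | FunctionAddr
  deriving (Eq, Show)

-- Hypothetical tag values, chosen only for illustration.
tagOf :: AddrKind -> Word64
tagOf DataAddr = 0x1
tagOf FunctionAddr = 0x2

-- Pack a 32-bit address into a 64-bit word, with the tag in the upper 32 bits.
tagAddr :: AddrKind -> Word32 -> Word64
tagAddr kind addr = (tagOf kind `shiftL` 32) .|. fromIntegral addr

-- Recover the kind, if the upper half holds a known tag.
kindOf :: Word64 -> Maybe AddrKind
kindOf w = case w `shiftR` 32 of
  0x1 -> Just DataAddr
  0x2 -> Just FunctionAddr
  _ -> Nothing

-- Drop the tag: the 64-bit to 32-bit cast needed before a load or store.
untagAddr :: Word64 -> Word32
untagAddr w = fromIntegral (w .&. 0xFFFFFFFF)

main :: IO ()
main = do
  let p = tagAddr DataAddr 0x1000
  print (kindOf p, untagAddr p)
-- prints (Just DataAddr,4096)
```

With a native 32-bit pointer convention, `tagAddr` and `untagAddr` disappear entirely and addresses are used as-is, which is where the expected code size and performance gains come from.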
Work in 2020 Q3 is focused on:
- Introducing C/C++ toolchain support. The first step is to introduce libc in the generated wasm code, and use libc functionality to replace certain runtime functionality (e.g. memory management). Once we're confident our runtime and generated code is compatible with libc, we'll look into building & linking C source files in Haskell packages.