Some random notes about Besta RTOS. These will probably end up on a wiki somewhere after Project Muteki becomes mostly usable.
NOPE. Not even close.
For unknown reasons, instead of going through the usual process of configuring a customizable compiler like GCC and building a dedicated toolchain for their custom OS, Besta seems to have bent the MSVC CE toolchain into building applets, despite the OS not being even remotely related to Windows CE. This is also likely the reason why coredll.dll is included in all known system images: it provides several OS-independent compiler helpers that the MSVC compiler needs, namely software floating-point arithmetic, 64-bit arithmetic and integer division routines. As an unfortunate side effect, a lot of things are broken, including C++ exceptions and threading/TLS, because the Windows CE-specific helpers obviously do not work on a completely different OS.
For developing Besta RTOS applets with Windows CE toolchain, custom CRT0 must be used and coredll functions must NOT be used for anything unless the functions are OS-independent, like the previously mentioned helper routines. All syscalls must either go through sdklib/krnllib or be invoked via bare SVCs (e.g. with help from mutekishims).
There are also some Win32 API-looking routines provided by sdklib/krnllib, although there is no technical reason for them to be Windows-ish, other than perhaps providing some familiarity to Windows devs. Ironically, the fact that they are sometimes not completely compatible with the real Win32 APIs also makes them a point of frustration for less experienced Besta RTOS developers.
Looking at the scheduler code, seems like the Besta RTOS is based on a heavily modified uC/OS-II kernel with a drastically different set of OS API exposed to user code. The rest of the system seems to be developed in-house, sometimes utilizing various existing open-source (e.g. SQLite3, FFmpeg, YAFFS2) and closed-source (e.g. Voxware) components.
Maybe. I haven't tried it yet and I'm not a lawyer but maybe.
Hidden modes (diagnostics and DFU)
Diagnostic mode can be used to identify the board type, which is usually a 5-character identifier that starts with BA, CA, EA, JA, KA, etc. and is different from the device model number. It can also be used to verify the integrity of the system ROM. On some newer TLCS-900 based systems where a diagnostic menu is known to exist (tested on CA736) it can also be used to dump the system ROM currently installed onto the on-board NAND flash.
DFU may be usable to recover a device bricked by corruption of the OS/system image (TODO: verify how much of the OS can be corrupted before DFU is also rendered useless. More NAND dumps would be needed for this).
Open the TAD app (it's usually called Service Home, 服务中心 (服務中心), etc. depending on your language settings) and type "diagnostic" using the keyboard. This works on all Arm-based devices with some notable exceptions and some later devices based on the TLCS-900 architecture.
For HP Prime calculators, holding the C, F and O buttons and pressing the Reset button enters the diagnostic mode.
For Sharp dictionaries (at least for JA739), the diagnostic mode is not present and diagnose.exe is not included in the system image.
For Pocket Challenge, there seems to be an external diagnostic menu included in the update SD cards. The system image does also include diagnose.exe but currently it's not certain whether or not it functions properly.
Hold the P button and press RESET to enter DFU mode.
On some systems, the key combo may be slightly different. Specifically on BA742, one needs to hold the P button and press the power button instead.
Once called (in Arm mode), push r0 then lr to the stack in this exact order (this needs 2 instructions) and initiate an SVC call with the desired syscall number.
Syscalls will not work directly in THUMB mode due to the instruction size limit. Interworking is needed in order to make a syscall from THUMB code.
Besta RTOS uses a single address space memory layout with the kernel, applets and shared libraries all sharing the same address space. The MMU is not used even on SoCs that have one, so there is zero memory protection, meaning user space code has direct access to hardware registers, etc. Beware that this also makes NULL a valid address, which makes NULL dereferences harder to debug.
The kernel mode and user mode share the same heap, although there are several allocators allocated on the main heap for several data structures such as file descriptors and UI elements, possibly to avoid heap fragmentation.
Applet executables are mainly in PE format with ELF being an alternative option. The Windows CE subsystem type is not required, although elf2bestape adds it for consistency with some newer Besta RTOS applets. Applets can either be relocatable (true for most of the PE files) or loadable to an absolute base address, although an applet of the latter type is in practice only runnable if it is the "init" program (i.e. the first program to run after the OS is initialized).
It's unclear whether ELF applets support relocation or not, since the only ELF applet known to exist is Prime G1's armfir.elf and it loads to an absolute address.
Like applets, shared libraries can either be in PE format or ELF. The former is used by applets, either statically via the import table or dynamically with _LoadLibrary*(). The latter is used mostly by the kernel as some sort of "kernel module". The _start function seems to get ignored when loading them.
Like under Windows, putting a shared library in the same directory as the applet overshadows the system version. This could be used to e.g. trace syscalls.
The thread model seems to be very similar to uC/OS-II (down to the algorithm level almost source-line-to-source-line), although the public API is totally different.
Up to 64 threads can exist at the same time, 38 of which can be created directly via OSCreateThread().
The code to handle THUMB mode in the CPU context initializer, which is in stock uC/OS-II's Arm Generic port, seems to be missing. Does that mean THUMB mode is broken? (Maybe not, but using a THUMB function as an entry point might not work without workarounds, i.e. an interworking function or patching the saved CPSR. Given that we might need a stack aligner for EABI->OABI conversion anyway, this might not be so bad.)
Priority is implied by the natural order of the threads in the global thread table (uC/OS-II just calls this the priority table). Some slots in the table seem to be reserved (8 at the top and 18 at the bottom) and are not accessible by just allocating the thread with OSCreateThread(). Users can move threads to these reserved slots by calling the OSSetThreadPriority() function.
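The slot arithmetic above can be sketched as follows. This is only an illustration: the macro and function names are made up for these notes, and it assumes that "top" means the lowest table indices (highest priority, as in stock uC/OS-II), which has not been confirmed on a real system.

```c
#include <assert.h>
#include <stdbool.h>

/* Figures taken from the notes above; all names here are hypothetical. */
#define THREAD_SLOTS_TOTAL    64u /* size of the global thread table */
#define THREAD_SLOTS_RSVD_TOP  8u /* reserved slots at the top */
#define THREAD_SLOTS_RSVD_BOT 18u /* reserved slots at the bottom */
#define THREAD_SLOTS_USABLE \
    (THREAD_SLOTS_TOTAL - THREAD_SLOTS_RSVD_TOP - THREAD_SLOTS_RSVD_BOT)

/* The remaining slots should match the 38 threads OSCreateThread() can hand out. */
static_assert(THREAD_SLOTS_USABLE == 38u, "slot counts do not add up");

/* Assumption: slot 0 is the top of the table. Out-of-range slots count as reserved. */
static bool thread_slot_is_reserved(unsigned slot) {
    return slot < THREAD_SLOTS_RSVD_TOP
        || slot >= THREAD_SLOTS_TOTAL - THREAD_SLOTS_RSVD_BOT;
}
```

Under that assumption, slots 8 through 45 would be the ones OSCreateThread() allocates from, and anything outside that range would need OSSetThreadPriority().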
The scheduler always executes the task that has the highest priority, so using OSSleep() is necessary to prevent one thread from holding the CPU for too long.
(TODO figure out if the thread can be yielded when waiting for IO)
1 jiffy is 1ms.
The OSSleep(jiffies) syscall calls the OSTimeDly() function in the uC/OS-II scheduler, which then puts the thread to sleep for the specified number of jiffies. Since 1 jiffy is 1ms in Besta RTOS, in practice this delays the thread for at most the specified number of milliseconds.
There's also a Delay() syscall that can delay beyond INT16_MAX jiffies within a single SVC call.
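Where only OSSleep() is available, a long delay could be split into several shorter ones. The sketch below only computes the chunk sizes so it can be checked off-device. It assumes, based on the remark about Delay(), that a single OSSleep() call should not be given more than INT16_MAX jiffies; the helper name and the macro are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one OSSleep() call takes at most INT16_MAX jiffies (1 jiffy = 1 ms). */
#define SLEEP_CHUNK_MAX ((uint32_t)INT16_MAX)

/*
 * Returns the number of jiffies to sleep next and subtracts it from
 * *remaining. Returns 0 once nothing is left to sleep.
 */
static uint32_t next_sleep_chunk(uint32_t *remaining) {
    assert(remaining != NULL);
    uint32_t chunk = *remaining > SLEEP_CHUNK_MAX ? SLEEP_CHUNK_MAX : *remaining;
    *remaining -= chunk;
    return chunk;
}
```

A caller would keep passing each non-zero chunk to OSSleep() until the helper returns 0. Each chunk costs one SVC call, which is exactly the overhead Delay() avoids.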
Events are a stripped-down version of uC/OS-II mboxes. They don't have the ability to pass arbitrary message pointers like mboxes do.
Events have one extra flag compared to mboxes. Once set by OSCreateEvent(), it prevents the event flag from being cleared when an OSWaitForEvent() call completes without hitting a timeout or error.
Critical sections provide mutually exclusive access to shared resources between threads. They seem to be recursive (as the context struct seems to hold a copy of the reent struct whenever it enters from the same thread/has the same reent struct pointer).
When a thread acquires a free critical section, it only changes the state of that critical section and nothing else on the kernel side is touched. (Unless, of course, when another thread tries to acquire the same critical section. Then that thread will be set to wait for that critical section.)
They also seem to have some kind of index value and a byte array for unknown purpose. More investigations needed. These are standard uC/OS-II thread wait states.
Since the context holds a copy of the current thread pointer, it is possible to use critical sections to find out which thread is currently running. To do this, create a critical section locally first. This ensures that no other thread is acquiring it. After this, simply acquire the descriptor with OSEnterCriticalSection() and read out the pointer.
One safe implementation (4 syscalls) is shown as follows:
#include <muteki/threading.h>

thread_t *get_current_thread() {
    thread_t *thr = NULL;
    critical_section_t mutex;

    // A local critical section cannot be contended by other threads.
    OSInitCriticalSection(&mutex);
    OSEnterCriticalSection(&mutex);
    // Entering records the current thread in the context.
    thr = mutex.thr;
    OSLeaveCriticalSection(&mutex);
    OSDeleteCriticalSection(&mutex);
    return thr;
}
There is also a faster but hackier way. It abuses an implementation detail of the critical section: no other resource is allocated and no other state is changed when a free critical section is acquired for the first time. By only barely initializing the critical section and calling OSEnterCriticalSection() without any cleanup, the number of syscalls required drops to just 1. This works on both CD-580+ and WuDi V7.
#include <muteki/threading.h>

thread_t *get_current_thread() {
    critical_section_t mutex;

    // Magic is not checked so not needed here
    mutex.thr = NULL;
    mutex.refcount = 0;
    OSEnterCriticalSection(&mutex);
    return mutex.thr;
}
Error code is stored in the thread descriptors.
Code set via OSSetLastError will have the flag 0x20000000 set when read back by _GetLastError().
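Code that compares error values may want to account for that flag. A minimal sketch, with made-up macro and function names, and assuming the error code fits in 32 bits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag observed on codes that went through OSSetLastError; the name is hypothetical. */
#define ERR_USER_FLAG UINT32_C(0x20000000)

/* 0x20000000 is bit 29. */
static_assert(ERR_USER_FLAG == (UINT32_C(1) << 29), "flag is expected to be bit 29");

/* True if the value read back carries the flag. */
static bool err_is_user_set(uint32_t err) {
    return (err & ERR_USER_FLAG) != 0;
}

/* Recovers the value without the flag. */
static uint32_t err_strip_user_flag(uint32_t err) {
    return err & ~ERR_USER_FLAG;
}
```

Bit 29 is also the bit Win32 reserves for application-defined error codes, so this may be another Windows-ism, but that is a guess.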
See muteki/errno.h for error codes documented by parsing the FormatMessage() string table.
See hca.xxdm
Would be useful for e.g. emulators.
GetActiveVRamAddress() returns a framebuffer descriptor. It includes the framebuffer as well as its format. This could be a potential way of accessing the framebuffer with e.g. a software renderer that has no tie to the kernel.
TODO.
(OpenPCMCodec and ClosePCMCodec look suspicious)
This can be used to e.g. simplify syscall black box testing or implement untethered other OS booting.
Create the file C:\SYSTEM\DESKTOP.INI with DOS (CRLF) line endings and put
[DESKTOP SETTING]
ENTRY = <dos-8.3-path-to-exe-you-want-to-run>
into the file.
WARNING: This will replace the home screen with the file you specified and might cause the system to not boot properly. If this happens, a full system reset (clearing settings and wiping the C: drive) will fix it, although it will erase all data in system memory and settings. Alternatively, if chainloading a secondary program is possible, you can also run a program that can help you recover from this situation, e.g. using \\.\EXPLORE.ROM to delete the ini file and reboot (this will not be possible on most systems without a loader that strips the v4 args).
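When generating the file from a host-side tool, the DOS line endings are easy to get wrong. The sketch below formats the contents into a buffer with explicit CRLF. The function name is hypothetical, and the trailing CRLF after the ENTRY line is an assumption rather than a confirmed requirement.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats the DESKTOP.INI contents for the given 8.3 path into buf.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or -1 if the path is empty or the buffer is too small.
 */
static int format_desktop_ini(char *buf, size_t size, const char *entry_path) {
    assert(buf != NULL && entry_path != NULL);
    if (strlen(entry_path) == 0) {
        return -1;
    }
    int n = snprintf(buf, size, "[DESKTOP SETTING]\r\nENTRY = %s\r\n", entry_path);
    if (n < 0 || (size_t)n >= size) {
        return -1; /* encoding error or truncated output */
    }
    return n;
}
```

If the result is written with the C standard library on a Windows host, open the file in binary mode ("wb") so the runtime does not expand the line endings a second time.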
PATH_MAX is 256 UTF-16 code units (512 bytes) with NUL terminator. For the 8.3 paths used by CreateFile(), PATH_MAX is 80 bytes with NUL terminator. For 8.3 paths in CWD, PATH_MAX seems to be 64 bytes with NUL terminator.
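These limits can be written down as constants with a small length check. The macro and function names are invented for this sketch, and it reads "with NUL terminator" as meaning that the terminator counts towards the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Limits from the notes above; the names are hypothetical. */
#define BESTA_PATH_MAX_UTF16 256u /* UTF-16 code units, i.e. 512 bytes */
#define BESTA_PATH_MAX_DOS83  80u /* bytes, 8.3 paths passed to CreateFile() */
#define BESTA_PATH_MAX_CWD83  64u /* bytes, 8.3 paths in CWD (unconfirmed) */

/*
 * True if the NUL-terminated 8.3 path fits in a buffer of `limit` bytes.
 * Assumption: the limit includes the NUL terminator.
 */
static bool dos83_path_fits(const char *path, size_t limit) {
    assert(path != NULL);
    return strlen(path) < limit;
}
```

Under that reading, the longest 8.3 path CreateFile() would accept is 79 characters.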
Neither am I.