On every machine in the cluster install openmpi and mlx-lm:
conda install conda-forge::openmpi
pip install -U mlx-lm
Next, download the pipeline parallel run script. Download it to the same path on every machine:
import struct, time

## This experiment demonstrates that the claimed "hash" (that is not a hash) used
## by the L2 cache ECC debug feature used by Operation Triangulation is not secure,
## and can be trivially reverse engineered by anyone who owns one of the machines
## with the hardware (such as any M1 Mac), in seconds to days. Therefore, this proves
## that no "insider" access or leak is necessary to obtain this table, and that the
## attackers most likely did exactly the same thing.

## This is the "black box", i.e. the hardware: The table is not exposed to the caller.
class BlackBox:
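The comments above describe recovering a hidden table using nothing but queries to the hardware. The following is a minimal sketch of why that can be easy. It rests on an assumption made here for illustration and not taken from the excerpt: that the function XORs together one secret table entry per set input bit. Under that assumption, querying one bit at a time reads the table out directly. `ToyBlackBox`, `recover_table`, `predict`, and the 256-entry, 10-bit table size are all hypothetical names and values, not the real hardware's.

```python
import random


class ToyBlackBox:
    """Hypothetical stand-in for the hardware: the table is private."""

    def __init__(self, bits=256, out_bits=10, seed=None):
        rng = random.Random(seed)
        self.bits = bits
        # Secret table (illustrative random values, one entry per input bit).
        self._table = [rng.getrandbits(out_bits) for _ in range(bits)]

    def calculate(self, data: int) -> int:
        # ASSUMPTION: output is the XOR of the entries selected by set bits.
        result = 0
        for i in range(self.bits):
            if (data >> i) & 1:
                result ^= self._table[i]
        return result


def recover_table(box):
    # An input with only bit i set returns exactly table[i],
    # so one query per input bit recovers the whole table.
    return [box.calculate(1 << i) for i in range(box.bits)]


def predict(table, data):
    # Recompute the function offline from the recovered table.
    result = 0
    for i, entry in enumerate(table):
        if (data >> i) & 1:
            result ^= entry
    return result


if __name__ == "__main__":
    box = ToyBlackBox(seed=1234)
    table = recover_table(box)
    sample = random.Random(0).getrandbits(box.bits)
    print(predict(table, sample) == box.calculate(sample))
```

If the real function is not XOR-linear in this way, single-bit probing alone would not be enough and a different search would be needed.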
For the past six months I've been learning about hacking my Wii U. I could have completed this project in a weekend, but sometimes I get an itch to go further.
My goal has been to have the ultimate couch console where friends can play video games in the same place in front of the same screen (plus gamepad screen). After that first weekend I was able to play homebrew and make game backups to play. It even came with new software to use PS3 controllers on the console as pro controllers. My console had become really cool, but it wasn't perfect. So, I began working on getting it from 75% perfect to 95% perfect (see unfixed cons below).
The documentation for that first 75% is really good and simple, but it's so simple that the learning curve for doing more advanced things is steep. The research included digging through decade-old forum posts and out-of-date wikis, finding files in abandoned MEGA drive downloads, reading source code in dozens of repos, and lots of t
# IDA (disassembler) and Hex-Rays (decompiler) plugin for Apple AMX
#
# WIP research. (This was edited to add more info after someone posted it to
# Hacker News. Click "Revisions" to see full changes.)
#
# Copyright (c) 2020 dougallj
# Based on Python port of VMX intrinsics plugin:
# Copyright (c) 2019 w4kfu - Synacktiv
server:
  logfile: ""
  # verbosity: 2
  username: "nobody"
  interface: 0.0.0.0
  access-control: 0.0.0.0/0 allow
  prefetch: yes
  # include: "/opt/unbound/local.conf"
  # include: "/opt/unbound/customize.conf"
FROM amazonlinux:2
RUN yum -y update && \
    yum -y groupinstall -y 'Development Tools' && \
    yum -y install git zlib-devel perl-core
RUN git clone --depth=1 https://github.com/openssl/openssl && \
    (cd openssl && ./config no-shared --prefix=$(pwd)/usr && make install)
This document was originally written several years ago. At the time I was working as an execution core verification engineer at Arm. The following points are coloured heavily by working in and around the execution cores of various processors. Apply a pinch of salt; points contain varying degrees of opinion.
It is still my opinion that RISC-V could be much better designed, though I will also say that if I were building a 32 or 64-bit CPU today I'd likely implement the architecture to benefit from the existing tooling.
Mostly based upon the RISC-V ISA spec v2.0. Some updates have been made for v2.2.
The RISC-V ISA has pursued minimalism to a fault. There is a large emphasis on minimizing instruction count, normalizing encoding, etc. This pursuit of minimalism has resulted in false orthogonalities (such as reusing the same instruction for branches, calls and returns) and a requirement for superfluous instructions which impacts code density both in terms of size and