I'm currently upscaling everything in my big diffusion pretrain dataset so I can start training some real structure.
If a couple of epochs on that data don't activate the model, I'll need to employ a David structure and attempt to teach global attention to a shared battery set.
At the same time, heavy experimentation on the geolip-aleph-void structure and its potential offshoot objectives is being transcribed and curated. There are multiple prototypes with potential, all based on known functional structures, and today's discoveries include a stable attention mechanism that can be curated further. It is based on an earlier experiment named cantor fractal routing.
Cantor fractal routing was a badly optimized prototype that managed to stabilize deep-complexity fractal routes with low VRAM at the cost of time. The primary problem was that trading time for VRAM only pays off if you're training a massive model. You don't get the benefit with small models like the ones I usually train, so it was mothballed.
Geolip aleph routed attention is a viable option for training a David, and it can in fact handle small models, but it needs much more testing. As it stands, it does not benefit from the large-model VRAM routing optimizations that cantor fractal routing has, which essentially means it will OOM like traditional attention. However, because it's based on the aleph structure, it will stabilize point clouds for Q and K, which, when employed structurally, can provide a cached V. I'm testing structural changes that will let the structure bind deterministic systems to K, so that KV caching can happen while Q operates normally.
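To make the KV caching idea concrete, here is a minimal NumPy sketch of plain scaled dot-product attention where K and V come from a fixed, deterministic set and are projected once, while Q is computed fresh for every input. This is ordinary KV caching, not the aleph routed attention itself. The `KVCache` class, the `anchors` array, and the weight names are all hypothetical and exist only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    """Scaled dot-product attention. q: (n_q, d), k and v: (n_k, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

class KVCache:
    """Hypothetical helper: projects a fixed anchor set to K and V once."""

    def __init__(self, w_k, w_v):
        self.w_k, self.w_v = w_k, w_v
        self.k = None
        self.v = None

    def build(self, anchors):
        # Assumption: the anchors are deterministic, so K and V never change
        # between calls and can be stored.
        self.k = anchors @ self.w_k
        self.v = anchors @ self.w_v

    def query(self, x, w_q):
        # Only Q is recomputed per input; K and V are reused from the cache.
        return attend(x @ w_q, self.k, self.v)

rng = np.random.default_rng(0)
d = 8
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
anchors = rng.standard_normal((16, d))  # stand-in for a deterministic K source

cache = KVCache(w_k, w_v)
cache.build(anchors)

x = rng.standard_normal((4, d))
out_cached = cache.query(x, w_q)
out_full = attend(x @ w_q, anchors @ w_k, anchors @ w_v)
```

The cached path and the fully recomputed path should give the same output. The saving is that the K and V projections are skipped on every call after the first. The score matrix itself is still `n_q` by `n_k`, so this alone does nothing for the OOM behavior.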
With aleph routed attention worked out, I'll be able to give SDXL an actual backbone instead of just a partial one through tokens. This will allow the model to directly differentiate tokens through gated learning and attention anchoring, which in theory could enable surge training through Procrustes. They are essentially different towers, though, so until the experiments are done I'm still uncertain whether the effect will carry over or only be topical.
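For reference, this is the generic orthogonal Procrustes step in NumPy: given two sets of paired vectors, find the rotation that best maps one onto the other. It is only the textbook alignment, not the surge training procedure. Treating the two towers as two paired embedding matrices `a` and `b` is my assumption for the sake of the example.

```python
import numpy as np

def orthogonal_procrustes(a, b):
    """Return the orthogonal matrix r that minimizes ||a @ r - b||_F."""
    # The SVD of the cross-covariance gives the optimal rotation as u @ vt.
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt

rng = np.random.default_rng(0)

# Hypothetical "tower A" embeddings: 32 paired points in 6 dimensions.
a = rng.standard_normal((32, 6))

# Build "tower B" as a rotated copy of A so the true answer is known.
true_rot, _ = np.linalg.qr(rng.standard_normal((6, 6)))
b = a @ true_rot

r = orthogonal_procrustes(a, b)
residual = np.linalg.norm(a @ r - b)
```

In this toy case the recovered `r` matches `true_rot` and the residual is near zero, because B really is a rotation of A. With two genuinely different towers the residual will not vanish, and its size is one rough way to judge how well the alignment carries over.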

