viti95 Posted March 26
Some unexpected news: I've been able to add Hercules InColor support. https://www.seasip.info/VintagePC/incolor.html
zokum Posted March 26
17 hours ago, Darkcrafter07 said:
John Carmack once said in the 30 years of Doom chat with Romero that making more demanding games was more rewarding, as people spending more money drives the whole industry. That could have been a reason they dropped further Doom engine optimizations and mode 13h, as more and more people were getting faster computers. There is another factor that could have led to such behavior: Doom was a benchmark, and it made people proud of their FPS numbers. You know, it's the same thing that happens now with people buying hardware to outperform their neighbors in FPS. Doom as a driver of progress. Then Quake came out and they almost forgot about it; by the time Q2 came out, running Doom was not a problem at all, I believe.

I seriously doubt that. They had fps limits in place and cut areas of maps if the frame rate was too low. With a slightly faster engine, they would just have built more complex maps or added other bells and whistles. One of the things they probably should have done was to optimize the blockmap algorithm, make it part of the loading routine and stop saving it on disk, saving disk space at the price of slightly longer loading times. Luckily they didn't, so now we can tweak the blockmap to make special effects.
zokum Posted March 26
21 hours ago, deathz0r said:
Unfortunately, it breaks every single NoMo demo that exists on the DSDA archive. Stroller demos (which use -turbo 50) appear to be unaffected, from testing a few of them.

Off the top of my head, -turbo just multiplies your movement by a factor. In the demo, your sr50 is stored as sr25 if you have a stroller turbo setting.
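
To make the "-turbo multiplies your movement by a factor" point concrete, here is a minimal C sketch. It assumes the base speed tables and the strafe-run clamp work as in the original Doom source (the values 0x19/0x32 and 0x18/0x28 are quoted from memory and should be treated as an assumption); `apply_turbo` and `strafe_run_side` are hypothetical helper names, not functions from FastDoom.

```c
#include <assert.h>

/* Base speeds for walking [0] and running [1].
   Assumed to match the original Doom tables. */
static int forwardmove[2] = {0x19, 0x32};
static int sidemove[2]    = {0x18, 0x28};

/* Scale every movement speed by a percentage, the way -turbo is
   described in the post above (hypothetical helper name). */
static void apply_turbo(int percent)
{
    assert(percent > 0);
    for (int i = 0; i < 2; i++) {
        forwardmove[i] = forwardmove[i] * percent / 100;
        sidemove[i]    = sidemove[i] * percent / 100;
    }
}

/* Strafe-running stacks two sideways contributions, then clamps the
   result to the maximum forward speed (hypothetical helper name). */
static int strafe_run_side(void)
{
    int side    = sidemove[1] + sidemove[1];
    int maxmove = forwardmove[1];
    return side > maxmove ? maxmove : side;
}
```

Under these assumptions, `strafe_run_side()` gives 50 at the default speeds and 25 after `apply_turbo(50)`, which is the "sr50 is stored as sr25" effect.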
Darkcrafter07 Posted March 30
So, I took my "retro PC" out of storage and gave the new fdoom a run. Celeron 2.4GHz, FX 5500 128 MB, 512 MB RAM. Doom 2 timedemo results: fdoom I didn't even run, but it could be ~150 fps; fdoom13h: 320 fps or so; fdm400d: 85 fps; fdm400r: 200+ fps. I didn't run 480, but everything from 600 up didn't work due to a "VESA2 not found" error. Yes, VESA2 modes worked with the 400 exes, somehow. No gameplay glitches like spontaneous network-like lag or silent teleportation occurred, unlike on every PCem machine I tried. The only bad thing about the VESA exes now is a different sector lighting mode that looks like PrBoom, which really must be fixed because it makes the game look wrong. For machines where SBEMU doesn't work, could you implement an OPL3 synth emulator from Chocolate Doom? I got a DOS driver for Realtek AC97, which surprisingly worked.
viti95 Posted April 2
I'll take a look, I haven't tested any of my FX cards yet. Adding an OPL emulator is a bit outside the main goal of the project, as SBEMU now works really well and its support is much better. Also, we are much closer to a "1.0" release, and I was thinking about what a "1.0" release should have. For now, I've added support for Mode-X (320x240), very useful on laptops without VESA support and with 640x480 screens, and Hercules InColor support. What do you think the next release should have?
zokum Posted April 2 (edited)
I think I suggested it earlier, but there are some less common sound cards, most likely with less than perfect SB emulation, that I remember having seen open-sourced games support. I looked at the page and there is no mention of any MT-32 support. That's classic hardware you could support.

There's one thing I thought about a few years ago. Both the AWE and GUS series allow for sample upload, sample playback, hardware mixing etc. Could you upload all the in-game sounds to these types of cards and just offload the work from the CPU? Monster sounds could get cut off by instruments, but if it provides a noticeable speedup it would be an interesting choice. With music turned off, you should easily be able to play more sounds at the same time than the usual 8. Any pitch-bend effects would also be easy to do.

Graphically, I think low-color modes could benefit from upscaling the textures and using more pixels to dither them to better approximate the original textures (less banding). It would eat more memory, but 16 color textures in 2x2 size would only need twice the memory of a normal texture. Combine it with the technique below and we could have a winner.

Another possible approach to speed it up would be to implement some form of fewer-textures mode. When two textures are sufficiently similar, the game would use the same texture in both places. If all the BIGDOORs were the same color, it would not be a huge loss, but a nice memory saving. You'd need special handling of switches to avoid losing important graphical cues. I haven't looked, but how does the one-color wall mode handle switches? Is there some sort of workaround to make a switch stand out as a switch?
Edited April 2 by zokum
Darkcrafter07 Posted April 2
11 hours ago, viti95 said:
I'll take a look, I haven't tested any of my FX cards yet. Adding an OPL emulator is a bit outside the main goal of the project, as SBEMU now works really well and its support is much better. Also, we are much closer to a "1.0" release, and I was thinking about what a "1.0" release should have. For now, I've added support for Mode-X (320x240), very useful on laptops without VESA support and with 640x480 screens, and Hercules InColor support. What do you think the next release should have?

I think you should add an auto-run configuration key; right now it's only possible to turn it off via direct cfg file editing. The OPL music slowdown is really bad on CPUs under 100MHz. I heard on YouTube that ISA slots run at an 8MHz clock, so maybe this is the reason it doesn't get along with bus and CPU frequencies, as those cannot be divided by 8 without a remainder, so the interrupts somehow land in the wrong places and it slows down? Make it 6.6MHz so that the results have no remainders. Then use 8/6.6 = 1.21 as a coefficient to multiply the MIDI tempo by, to compensate for slower bus speeds? BS?
viti95 Posted April 2
3 hours ago, Darkcrafter07 said:
I think you should add an auto-run configuration key; right now it's only possible to turn it off via direct cfg file editing. The OPL music slowdown is really bad on CPUs under 100MHz. I heard on YouTube that ISA slots run at an 8MHz clock, so maybe this is the reason it doesn't get along with bus and CPU frequencies, as those cannot be divided by 8 without a remainder, so the interrupts somehow land in the wrong places and it slows down? Make it 6.6MHz so that the results have no remainders. Then use 8/6.6 = 1.21 as a coefficient to multiply the MIDI tempo by, to compensate for slower bus speeds? BS?

Autorun key is F12. In which config do you have slowdowns on OPL cards? Seems a bit strange to me; I've tried very slow 386 CPUs and music seems to play fine (maybe my ears are not trained on this topic 😅). Where I have found issues is on some Sound Blaster clones with fast CPUs, but those issues also happened in other games.

11 hours ago, zokum said:
I think I suggested it earlier, but there are some less common sound cards, most likely with less than perfect SB emulation, that I remember having seen open-sourced games support. I looked at the page and there is no mention of any MT-32 support. That's classic hardware you could support.
There's one thing I thought about a few years ago. Both the AWE and GUS series allow for sample upload, sample playback, hardware mixing etc. Could you upload all the in-game sounds to these types of cards and just offload the work from the CPU? Monster sounds could get cut off by instruments, but if it provides a noticeable speedup it would be an interesting choice. With music turned off, you should easily be able to play more sounds at the same time than the usual 8. Any pitch-bend effects would also be easy to do.

Yeah, MT-32 support is still missing. I do have one, so I can try to support it better.

As for hardware mixing, I think it's possible to support this very well on AWE cards, as lots of memory is available for samples (up to 28 MB). On the GUS this is a bit harder: memory is limited to 1 MB, and it's used close to 100%. But not all is lost on GUS cards, since old revisions of the Apogee Sound System did support multichannel hardware sound mixing. It was later removed in order to simplify the library. One of the things I wanted to do was to remove the external dependencies for AWE and GUS cards; right now both use external libraries for their support. Again, DOS sound code is hard as f*ck, so any help here is very welcome.

11 hours ago, zokum said:
Graphically, I think low-color modes could benefit from upscaling the textures and using more pixels to dither them to better approximate the original textures (less banding). It would eat more memory, but 16 color textures in 2x2 size would only need twice the memory of a normal texture. Combine it with the technique below and we could have a winner.

Low color modes benefit a lot from higher resolutions + dithering; the main issue here is the 8-bit ISA bus bandwidth. It's extremely slow. That's why I had to remove "high resolution" modes like EGA 640x200 or ATI 640x200. Even with optimizations, those modes run at 12 fps at best on period hardware. Also, we are not using direct rendering for low color modes, as this is only possible with a chunky pixel layout. I think it's not very effective to render directly with 16 colors or less to the backbuffer; it's better to render in 256 colors and use better/faster algorithms to reduce color depth.
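
The idea of rendering to a full-depth backbuffer and reducing color depth afterwards can be sketched with a 2x2 ordered dither. This is only an illustration and not FastDoom's actual algorithm: it assumes each backbuffer byte is a plain 0..255 intensity (a real renderer would go through the palette first), and `dither_to_16` and `reduce_buffer` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 2x2 Bayer matrix; each entry 0..3 selects one of four thresholds. */
static const uint8_t bayer2[2][2] = { {0, 2}, {3, 1} };

/* Reduce one 8-bit intensity to a 4-bit level (0..15). The screen
   position decides whether an in-between value rounds up or down,
   so a flat area becomes a fine checker pattern of two levels. */
static uint8_t dither_to_16(uint8_t v, int x, int y)
{
    unsigned scaled    = (unsigned)v * 15u;   /* 0..3825 */
    unsigned level     = scaled / 255u;       /* lower candidate level */
    unsigned rem       = scaled % 255u;       /* distance to the next level */
    unsigned threshold = (2u * bayer2[y & 1][x & 1] + 1u) * 255u / 8u;

    if (rem > threshold)
        level++;
    assert(level <= 15u);
    return (uint8_t)level;
}

/* Convert a whole w x h buffer after rendering has finished. */
static void reduce_buffer(const uint8_t *src, uint8_t *dst, int w, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * w + x] = dither_to_16(src[y * w + x], x, y);
}
```

Because the reduction is a separate pass over a finished frame, the renderer itself stays unchanged and the pass can be swapped for a faster or better-looking one per video mode.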
Darkcrafter07 Posted April 2
2 hours ago, viti95 said:
Autorun key is F12.

What a noob, sorry. I tested fdoom13h.exe in PCem with an i486 DX2-66 and an i486 DX4-100. Anything below 100MHz always slows the tempo down by about 3%, which can be quite noticeable in music.
zokum Posted April 3
The GUS memory problem is only an issue if you have music turned on. If I could drop music and gain some fps, I'd do that over a lot of the graphical tweaks. You could also run with the 256 KB or 512 KB instrument set and use the leftover for sounds on 1 MB cards. The Gravis Ultrasound PnP supports up to 8 MB with two 30-pin SIMMs. With pigus clones becoming a cheap reality, having access to a 'GUS' is starting to be a very viable retro PC setup.
zokum Posted April 3
When it comes to dithering, I was thinking of low-res modes. You often come close to a wall, and having dithering could improve the look. Moving in closer would reveal more image data from the original 256 color version. A shaded wall like the shawn/silver textures could look more shaded.
viti95 Posted April 3
New bugfix release! FastDoom 0.9.9b: https://github.com/viti95/FastDoom/releases/tag/0.9.9b
Changelog:
* Hercules InColor support
* Mode-X 320x240 mode support (great for old laptops without VESA support and with a 640x480 display)
* Fixed MDA debug, now text doesn't blink all the time
* Fixed issue #183, random crashes on low-res executables
* Fixed issue #62, video cards with more than 8 MB crashed on VESA modes
* Removed the "-singletics" command-line parameter. It was only used for debugging purposes.
* Upgraded build scripts
* Removed MS-DOS/Windows build scripts. Those were pretty much outdated, and all tooling is now Linux based. It's possible to build FastDoom on Windows using WSL2.
viti95 Posted April 7 (edited)
Aaaaand another small release! FastDoom 0.9.9c: https://github.com/viti95/FastDoom/releases/tag/0.9.9c
Changelog:
* Serial MIDI support (fixed baud rate 38400, COM1 through COM4 selectable)
* Fixed 512x384 VESA modes
Edited April 7 by viti95
viti95 Posted May 30
New release! FastDoom 0.9.9d: https://github.com/viti95/FastDoom/releases/tag/0.9.9d
Changelog:
* VESA 400x300 modes support
* Fixed issue #181 (FastDoom crashes on exit when using AWE32 music device + SoftMPU). Thanks @TheElf01 for finding this issue.
* Multiple optimizations (C)
* New VGA 320x100 executable. Uses the same direct rendering method as vanilla Doom, but with half height resolution. High detail resolution is 320x100, low detail 160x100 and potato detail 80x100. Recommended for slow 386 CPUs with VGA cards. 2D elements also have half height resolution, so text is pretty much unreadable (unless someone creates a WAD with optimized fonts for this mode)
* Fixed issue #192 (Save game buffer overflow). Now saving on MAP24 of Hell Revealed doesn't crash. Thanks @deat322 for finding this issue.
* Added debug-to-file support (only for developers)
Optimus Posted May 31
That's cool btw. I was always curious about a 100-line version, with low detail, to test a 160x100 version on my 386. It might even be better than potato mode, as I don't like its 4x1 pixel size. Pity the status bar is not readable.
Meerschweinmann Posted June 1
A really nice project, and very interesting, especially from a technical standpoint. Back in the day, that would have been THE solution to get DOOM running much smoother before we got our 66MHz+ 486 VL-bus computers. Keep up the good work.
Darkcrafter07 Posted June 1
That's awesome news, time to get out my old PC and give it a test. It would have been killer if the hi-res VESA modes had the brighter diminished lighting issue fixed. Btw, a new version of SBEMU came out that finally works on my SiS Realtek AC'97, but what a pity: fdoom says there's no such music and sound device (= -1), whatever I choose in the setup. I'm running MS-DOS 7.10, which comes with Windows 98SE, btw. I use HDPMI32, QEMM and the mouse.com driver from MS-DOS 6.22. It works OK with the original Doom executables, slows down a bit in Duke3D, and games like Dune 2 don't launch as they don't see XMS memory, and the machine has like 512 MB of DDR RAM. Regarding the quality of the OPL emulation, I think it could be enhanced a little, as even DOSBox sounds better on high-end percussion like hats; hopefully a setting for a 49716Hz sampling rate would help a little.
viti95 Posted June 2
45 minutes ago, Darkcrafter07 said:
It would have been killer if the hi-res VESA modes had the brighter diminished lighting issue fixed.

I've found what was causing the diminished lighting issue, so the fix will be available in the next release.
Darkcrafter07 Posted June 2
Awesome news, it must be something related to resolution, like a constant that was only calculated to work properly at 320x200. Btw, I love those sawwy textures; they could be a nice alternative to linear filtering if you made it that way, just wondering...
Optimus Posted June 3 (edited)
Tried timedemo 1 on my beefy 386DX 40MHz with an ISA Tseng Labs card. Also tried it against commercial Doom. Full screen, high detail, then half detail.

High detail: Doom 7.03, FDoom 9.247, Fdoom13h 10.842
Half detail: Doom 12, FDoom 16.379, Fdoom13h 15.602

Finally, tried the new FDoomH:
Full detail (320x100): 13.938
Half detail (160x100): 23.693

Interesting results, matching my expectations. I am curious about Fdoom13h, as it doesn't use Mode-X and doesn't have to write bytes in columns to slow VGA memory; instead it writes to a buffer and, I guess, does a fast 32-bit copy of the buffer to VRAM as the last step. So it ends up being even a bit faster than regular FDoom unless you go for half detail. Funny, I tried at some point before on a 486 or Pentium crippled with an ISA graphics card and mode 13h really paid off, but with a VESA Local Bus/PCI card the results were different. On the 386, of course, because the CPU is the main bottleneck, there is an improvement, but a slight one.

You might prefer to play half res anyway, where again, as I guessed, FDoom rather than the 13h version has the advantage (16.379 against 15.602), because I guess the double pixel Mode-X trick is used here.

Finally, which had the advantage, FDoomH or half-res FDoom? Was 320x100 or 160x200 better? As I guessed, fewer columns were better than fewer rows. FDoomH may be better if you want to go for the 160x100 experience.

My favorite is half detail FDoom when I play on the 386 and full detail on the 486 (I think FDoom13h gave no gain there; or rather it did on the Pentium with PCI but lost slightly on the 486 with VLB). I am also surprised that even full detail fullscreen on the 386 is less choppy than old Doom and kind of manageable. Maybe with a slightly smaller window it's OK.
GooberMan Posted June 3
On 6/2/2024 at 1:23 PM, Darkcrafter07 said:
Awesome news, it must be something related to resolution, like a constant that was only calculated to work properly at 320x200.

It's not a constant. The original flat renderer basically does lookups at 16-bit resolution instead of 32-bit resolution, and in a very cheating way, by sticking X and Y in one 32-bit integer to do the texel lookup calculations (so it overflows and loses accuracy very easily). And as visplanes render left-to-right by row, those kinds of inaccuracies are what you get as you get closer to the right hand side of the screen. You should be able to notice how much those pixels swim when you rotate your view.

Even increasing the resolution to 32-bit isn't quite good enough as you go higher and higher with your resolutions.

But hey, this is FastDoom. That original code is very fast for the fidelity levels it targets. I'd say fixing that for higher resolutions is outside of the scope of the project, but it also isn't my project.
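
The "X and Y in one 32-bit integer" trick can be sketched in C roughly as follows. The bit layout is modeled on the packed span loop found in some source ports and is an assumption here, as are the helper names `pack_xy`, `spot_packed`, `spot_full` and `count_mismatches`; the flats are taken to be 64x64 and coordinates 16.16 fixed point.

```c
#include <assert.h>
#include <stdint.h>

/* Reference lookup: separate 16.16 coordinates, full fraction kept. */
static unsigned spot_full(uint32_t xfrac, uint32_t yfrac)
{
    return ((yfrac >> 10) & (63u * 64u)) + ((xfrac >> 16) & 63u);
}

/* Pack X into the high half and Y into the low half of one word.
   Each half keeps 6 integer bits and only 10 fraction bits, so the
   low 6 fraction bits of every coordinate and step are thrown away. */
static uint32_t pack_xy(uint32_t x, uint32_t y)
{
    return ((x << 10) & 0xffff0000u) | ((y >> 6) & 0x0000ffffu);
}

/* Texel offset (y * 64 + x) taken from the packed position. */
static unsigned spot_packed(uint32_t position)
{
    unsigned ytemp = (position >> 4) & 0x0fc0u;
    unsigned xtemp = position >> 26;
    return xtemp | ytemp;
}

/* Walk one span both ways and count texels where the packed version
   picks a different texel than the full-precision one. */
static int count_mismatches(uint32_t x, uint32_t y,
                            uint32_t xstep, uint32_t ystep, int count)
{
    uint32_t position = pack_xy(x, y);
    uint32_t step     = pack_xy(xstep, ystep);
    int mismatches    = 0;

    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (spot_packed(position) != spot_full(x, y))
            mismatches++;
        position += step;   /* one add advances both X and Y */
        x += xstep;
        y += ystep;
    }
    return mismatches;
}
```

In this layout a single addition advances both coordinates, which is where the speed comes from. The cost is that each step is truncated by up to 63/65536 of a texel, so the error grows with the number of pixels in the span; it stays below roughly a third of a texel over 320 columns but can exceed a full texel past about 1040 columns, which fits the observation that the drift shows up towards the right edge and at higher resolutions. A carry out of the Y half also leaks into the X half.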
Darkcrafter07 Posted June 3
@GooberMan, indeed, it's much faster, and it gets even faster on modern hardware. I'm wondering whether faster hardware widens the gap between the "13h exe" and the "vanilla exes" even more. Unfortunately, I don't have such retro systems, but in PCem with a 486 and a Pentium, fdoom13h is almost exactly twice as fast as vanilla, and on a system with a Celeron at 2.4GHz and an FX 5500 128 MB the coefficient reaches 2.8.

... here's a bit of flood btw:
Spoiler
GZDoom's abandoned "poly renderer - experimental" is about twice as fast as the "doom software renderer" in GZDoom and LZDoom on the old Celeron machine with my Hell Renaissance project, which has a lot of 3D floors, stacked sectors, portals and models. It's a rasterizer for sure and lacks affine texture mapping techniques for rendering map geometry (but are they still used for rendering 3D models?). Some questions arise:
* It's like a Quake renderer but for GZDoom; does that mean it would run slowly on a 486?
* What could be the source of such a speedup in comparison to the software renderer? Coding wizardry? SIMD?
* Out of scope, but what if FastDoom used MMX and SSE on Pentium machines too? It's like drawing 4 pixels for the price of one.
* Could a rasterizer like "poly renderer - experimental" be ported to DOS source ports like FastDoom and ACE Engine to increase performance?
Meerschweinmann Posted June 4 (edited)
19 hours ago, Optimus said:
High detail: Doom 7.03, FDoom 9.247, Fdoom13h 10.842

Even if 10.842 fps is far from buttery smooth by today's standards, we would have been happy about such improvements back in the day. It looks like a 486 DX2-66 could handle FastDoom like a 486 DX4-100 handles vanilla DOOM. Those high-clocked 486 CPUs were very expensive back then, and FastDoom could have been a way to save money. Unfortunately, I don't have my 486 anymore to confirm that.
Edited June 4 by Meerschweinmann
viti95 Posted June 4
19 hours ago, Optimus said:
I am curious about Fdoom13h, as it doesn't use Mode-X and doesn't have to write bytes in columns to slow VGA memory; instead it writes to a buffer and, I guess, does a fast 32-bit copy of the buffer to VRAM as the last step. So it ends up being even a bit faster than regular FDoom unless you go for half detail.

Fdoom13h is sometimes a bit faster in full detail mode, because drawing columns and spans doesn't require switching between planes using OUT instructions, and the ASM code from Heretic makes better use of the x86 register set (no "scratchpad" memory is used). Also, it has two different ways to copy the backbuffer to VRAM. The first one uses 32-bit REP MOVS (the fast bus speed option), which is good on the 386. The slow option does a differential copy, updating only the required pixels (this option works fine with fast CPUs and a very slow 8-bit ISA bus).

19 hours ago, Optimus said:
You might prefer to play half res anyway, where again, as I guessed, FDoom rather than the 13h version has the advantage (16.379 against 15.602), because I guess the double pixel Mode-X trick is used here.

Yep, the double pixel (or the potato quad pixel) VGA optimization works really well here, as the ISA bus is very slow and bottlenecks quite easily. The less you use it, the better.

19 hours ago, Optimus said:
FDoom Half detail (160x200): 16.379
FDoomH Full detail (320x100): 13.938
Finally, which had the advantage, FDoomH or half-res FDoom? Was 320x100 or 160x200 better? As I guessed, fewer columns were better than fewer rows. FDoomH may be better if you want to go for the 160x100 experience.

I think FDoomH is a bit slower due to the number of OUT instructions used per frame to switch between planes: at least 320 are required for FDoomH, while FDoom half detail only requires 160.
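
The "differential copy" option can be pictured with a short C sketch. This is not FastDoom's code: it assumes a shadow copy of the last frame is kept in system RAM so that video memory is never read back, and `differential_copy` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write only the bytes that changed since the previous frame.
   vram   : destination, slow to write (video memory behind the bus)
   back   : freshly rendered frame in system RAM
   shadow : RAM copy of what vram currently holds (assumed helper buffer)
   Returns the number of bytes actually written to vram. */
static size_t differential_copy(volatile uint8_t *vram, const uint8_t *back,
                                uint8_t *shadow, size_t n)
{
    size_t written = 0;

    assert(vram != NULL && back != NULL && shadow != NULL);
    for (size_t i = 0; i < n; i++) {
        if (back[i] != shadow[i]) {   /* compare in fast RAM only */
            vram[i]   = back[i];      /* the expensive bus write */
            shadow[i] = back[i];
            written++;
        }
    }
    return written;
}
```

The trade-off matches the description in the post: every pixel costs a compare on the CPU, so this only pays off when the CPU is fast relative to the bus. On a 386, where the CPU is the bottleneck, a plain block copy of the whole buffer is the better choice.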
viti95 Posted June 18 (edited)
New release, FastDoom 0.9.9e!
Changelog:
* Fixed diminished lighting issues on high resolution VESA modes
* Fixed some 2D status bar and menu misalignments
* Fixed rendering corruption on VESA 386SX codepath
* Fixed MS-DOS 7 crash #187
* New fonts for 320x100 resolution modes (half height)
* Optimizations (C)
https://github.com/viti95/FastDoom/releases/download/0.9.9e/FastDoom_0.9.9e.zip
Darkcrafter07 Posted June 18
@viti95 it's awesome. I've been telling people here and there to try your port for a while; maybe they'll make some more videos of it running on real hardware!
MrFlibble Posted June 18
Apologies for what is probably a really dumb question, but I was wondering. It appears that on a 386, the only way to have a decent frame rate is the potato mode, which lowers the resolution very significantly. On the other hand, the admittedly much simpler Wolfenstein 3-D can play on a 386 at a decent speed with almost fullscreen and everything drawn at the max resolution.

@viti95, if you had control over both the game engine and the content, do you think it would be possible to achieve playable frame rates on a 386 without the need to lower the resolution/detail? Let's say you'd not only turn off the rendering of floors, ceilings and skyboxes, but all sprites and textures would have half of Doom's resolution, and you could make any simplifications to level geometry you'd find necessary to achieve playable speeds. Or would you just need to go back to Wolf3D's limitations of 90 degree walls and no height differences to achieve this?
viti95 Posted June 19 (edited)
On 6/18/2024 at 9:19 PM, MrFlibble said:
Apologies for what is probably a really dumb question, but I was wondering. It appears that on a 386, the only way to have a decent frame rate is the potato mode, which lowers the resolution very significantly. On the other hand, the admittedly much simpler Wolfenstein 3-D can play on a 386 at a decent speed with almost fullscreen and everything drawn at the max resolution.
@viti95, if you had control over both the game engine and the content, do you think it would be possible to achieve playable frame rates on a 386 without the need to lower the resolution/detail? Let's say you'd not only turn off the rendering of floors, ceilings and skyboxes, but all sprites and textures would have half of Doom's resolution, and you could make any simplifications to level geometry you'd find necessary to achieve playable speeds. Or would you just need to go back to Wolf3D's limitations of 90 degree walls and no height differences to achieve this?

Wolfenstein 3D uses lots of clever tricks to make it playable on 286 machines, which are not usable in Doom's engine. For example, there are no heights, no visplanes and only 90 degree walls, which makes computations way faster. It's very fast in high detail as sprites and walls are precompiled on each level, making scaling really fast on VGA cards (multiple columns are rendered in a single pass, see Game Engine Black Book: Wolfenstein 3D by Fabien Sanglard). Also, enemy "AI" in Wolf3D is much simpler compared to Doom. All in all, the 386 doesn't have the required power to run Doom properly without some serious cutdowns.
Edited June 19 by viti95
Blzut3 Posted June 20
From a rendering point of view, I'm not sure the 90 degree wall limitation actually provides a huge performance benefit. After all, the SNES and derived ports switched to rendering walls like Doom for additional performance, and I can't really imagine that allowing arbitrary segs would make a huge difference there. However, the tile based nature of the maps definitely does play a role in simplifying things throughout the code, so arbitrary decorative wall angles would be somewhat limited in use while retaining everything else.

To that end, a note about the AI: notably, the path finding function is the same from Catacomb up to at least Doom 3, but it is true that said function's decision is used in Wolf3D to just pick the next open tile to move to, which means Wolf3D didn't have to do what one would normally consider collision detection for anything but the player and projectiles.
MrFlibble Posted June 20 (edited)
13 hours ago, viti95 said:
All in all, the 386 doesn't have the required power to run Doom properly without some serious cutdowns.

Can you give a rough estimate of how much one would need to sacrifice in terms of level geometry complexity (and/or internal structure of the data) for a speed improvement? Or would it be just "easier" to write a completely new rendering engine from scratch that'd be optimized for the 386 architecture?

If we just put Doom aside, do you think that it is possible to create an engine that would be more advanced than that of Wolf3D and replicate at least some of Doom's features, but run better on a 386DX at least?

BTW, have you considered rendering distance as a detail level variable? For example, both Bethesda's Arena and Daggerfall have this "detail" slider that actually controls the viewing distance, which affects the frame rate quite notably.