Is it worth having 2X3090s in NVLINK when it does not work in Daz?
RexRed
Posts: 1,318
Comments
NVLink does work in DAZ, but as per my test there's no significant performance improvement in terms of rendering. I used some typical scenes for the test and saw only a 3 - 5% render time saving.
I did feel some performance improvement when manipulating my scene in the DAZ viewport. Maybe it was a false impression, but at the end of the day I don't think it's really worth having NVLink~
I consider this thread "solved". I turned off SLI, and had to shut down a lot of apps running in the background to do it (including Windows Search, which took me forever to find).
Then my monitors blinked off, I hit Enter while the screen was dark, and Nvidia uninstalled SLI.
Now I can load in Iray and render the Viking Camp (high res version) with my GPUs!
So YES it is worth having NVLINK!
With SLI on, NVLINK does not function properly.
It sends texture data to the wrong GPU and textures do not render right at all!
Before I turned off SLI I could not even get the Viking Camp to show in Iray without deleting half of the items in the scene.
And when I added a number 2 to the NVLINK peer group size in the render advanced settings my textures in my scene would go all wonky.
If you have 2 X 3090s connected by NVLINK turn off SLI... And Daz works great! Huge gain!
Add a number 2 to the NVLINK peer group size in the render advanced settings! And you are good to go!
My understanding has always been that the benefit of NVLink is VRAM pooling, not any performance boost beyond having the two cards running independently.
I don't think so~ Not sure which driver you use, Game Ready or Studio. If you deactivate SLI in the Nvidia Control Panel, there'll be no NVLink support from the software's perspective, even if you attach the link to your cards. You can then read the DAZ log after rendering and check the card-related info... each card worked separately.
I ran that test a year ago and never experienced the problem you mentioned. From peer group size 1 to 4, with SLI on and off, with the link attached or not, there was no problem at all with previewing, rendering, etc. The performance with the link just didn't impress me~~ so I sold it~
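As suggested above, the Daz log is the most direct way to see whether Iray actually used both cards for a render. A minimal sketch of scanning the log for its device lines; the sample log excerpt here is an assumption standing in for a real log, since the exact wording of Iray's device lines varies by version:

```python
import re

# Hypothetical excerpt of a Daz Studio log after an Iray render;
# the exact wording varies by version, so treat these lines as an example only.
sample_log = """
Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 0 (NVIDIA GeForce RTX 3090): compute capability 8.6
Iray [INFO] - IRAY:RENDER :: 1.0 IRAY rend info : CUDA device 1 (NVIDIA GeForce RTX 3090): compute capability 8.6
"""

def rendering_devices(log_text: str) -> list[str]:
    """Return the GPU names Iray reports as CUDA devices in the log."""
    pattern = re.compile(r"CUDA device (\d+) \(([^)]+)\)")
    return [name for _, name in pattern.findall(log_text)]

devices = rendering_devices(sample_log)
print(devices)
```

If only one card shows up here (or a device drops out mid-render and CPU fallback kicks in), that is the first thing to rule out before blaming the NVLink setup.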
Actually, I think I am incorrect (sorry).
I did not notice I had CPU fallback checked in the beta.
Turning off SLI does get rid of the wonky texture problem, and in Windows Task Manager it does say I have 47.8 GB of shared memory.
It would seem that the bottom line here is, memory pooling does not yet work in Daz Studio.
It duplicates the same textures on both cards and with memory pooling on it currently corrupts the textures in the scene.
The NVLINK bridge is probably best left off.
The reality is I bought two $1,500 graphics cards (and an $80 bridge) and still only have 24 GB of video memory.
Never mind~ It's fairly good that you have two 3090s, which are even better than a single 4090 in terms of total CUDA cores and overall performance. Don't forget there are still many people developing their patience while using 10- and 20-series cards for rendering ~
For example in the benchmarking thread, there are people with NVLink working.
If you are using the Nvidia workstation GPU driver with two linked 3090s, you should have WAY more rendering speed in Daz than someone with one card. Also, your cards won't be solely to blame if you still have issues rendering. Daz is a program that has been known for years to be poorly optimized, and many in the Daz community keep begging the devs to use the money they get to optimize and improve it into a good 3D modeling/posing/animating program.
Using NVLink would usually be slower than using each card separately, due to the overhead. The benefit, as noted above, is the ability to handle more data for materials, if you have enough system memory to feed it.
I have a friend who has an AMD system with (I believe) 24 or 32 cores running at 5 GHz, and I have an Intel machine with 18 cores running at 3.0 GHz.
We are both using the latest Nvidia studio driver.
Now this problem could be NVidia and not Daz...
But the problem exists across the board.
My friend tells me that NVLINK memory pooling used to work with Daz.
We are both advanced users of Daz. He has over 5000 Daz assets and I have nearly 10,000 assets.
I have been rendering 3D art since VistaPro on my old Commodore 64.
Now I do not know everything about this area, I just like to put assets together and make scenes.
I push the render button (or Ctrl + R on my PC) and hope it works.
Let's think about this logistically.
Correct me if I am wrong.
If you have two graphics cards, for example two RTX 3090s, and you do not use the NVLINK bridge, you can have each card render a frame at a time.
This, in my mind, would double your rendering speed.
If this is what you want, then that is perfect.
But in this scenario, the scene would need to fit entirely within each card.
Much like Bryce Lightning used to work.
So the gains in that case would be doubling your render speed.
On the other hand, pooled memory would take a scene that will not fit entirely in either card and split the geometry and textures up somehow, so each card holds half of the scene.
This is a different thing.
It would stand to reason that if the cards switch from working in tandem to pooling resources, one might not gain the doubling effect.
It may no longer render at double the speed.
Rather, one could take a scene that would not fit entirely in one card and render it anyway, instead of sending it to the CPU.
This saves time because the CPU is perhaps 10% or less of the speed of the GPUs.
So with pooled memory you lose the doubling effect on speed (I would assume) but gain the incredible speed advantage over having the CPU render it.
The way you toggle this is in the Daz Advanced Render tab.
By default, Daz loads the scene entirely into each card.
If you go in and change the NVLINK peer group size from zero to 2,
it should fit a scene too big for one card across two cards.
In both scenarios there are huge speed gains.
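The trade-off described above can be put into rough numbers. A back-of-the-envelope sketch, where the frame time and the "CPU is ~10% of a GPU" figure are illustrative assumptions from this thread, not benchmarks:

```python
# Illustrative figures only: assume one 3090 renders a frame in 100 s,
# and CPU fallback runs at roughly 10% of one GPU's speed (assumed).
gpu_frame_time = 100.0                   # seconds per frame on one card
cpu_frame_time = gpu_frame_time / 0.10   # CPU fallback: ~10x slower

# Scenario 1: scene fits in each card. Each card renders its own frame,
# so over many frames the effective throughput doubles.
two_cards_independent = gpu_frame_time / 2  # effective seconds per frame

# Scenario 2: scene too big for one card, memory pooled over NVLink.
# Roughly single-GPU speed (no doubling), but far faster than CPU fallback.
pooled = gpu_frame_time

print(f"independent cards: {two_cards_independent:.0f} s/frame")
print(f"pooled memory:     {pooled:.0f} s/frame")
print(f"CPU fallback:      {cpu_frame_time:.0f} s/frame")
```

Under these assumed numbers, pooling gives up the 2x of independent cards but is still an order of magnitude faster than dropping to CPU rendering, which is the whole argument for wanting it to work.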
Currently, the first method is fitting the scene entirely into both cards. That works, but I have my suspicions it has some bugs with VDBs and screen tearing, a bit like what running a game without V-Sync looks like. This problem is not evident in the renders that I have seen.
Overall, that works great.
On the other hand, the method of setting the peer group size to 2 causes the textures to get all messed up: bump maps for one surface seem to have been sent to the other card, or color parameters sent to another, and it is haphazard like this for every surface.
My friend with the AMD machine has observed the same problem.
I am not blaming Daz, it could be Nvidia, heck it could even be Windows 11, maybe Direct X or some other thing.
But it is happening to two different people with the same 3090 chips with different manufacturers cards, same Nvidia drivers and different (AMD vs Intel) state-of-the-art processors and system RAM.
Why is this broken? I don't know, it just is.
My friend says it used to work.
This is why I am asking this question.
I would like to be able to toggle this so that for large scenes, e.g. the Viking Camp, I can fit part of the scene into each card (rather than falling back to CPU rendering).
Or for smaller scenes I can fit the entire scene into each card.
I believe that is the way pooling works anyway; it only splits the scene up if it is too large for the cards.
This pooling method is broken.
I am not really upset about this because otherwise, Daz works fine with smaller scenes and if necessary I can just use my PC to render, even if it takes all night and most of the next day.
I am just making people aware that memory pooling is for some reason or another, "broken".
The fault should be investigated and if anyone has a solution please share it here.
It is likely the fault is with Nvidia. Recently we went a few months with dForce broken in the same way until Nvidia released a fixed driver.
I would simply like to see this fixed in the not-too-distant future.
I hope the buck simply does not just get passed around.
Best regards
RR
Just to clarify, I am the person with the AMD system. This is a visual depiction of what is happening:
This weird texture anomaly only happens when NVLINK Peer group size is set to 2.
According to DAZ's limited documentation and other information, to get VRAM pooling to work, SLI must be enabled (This is standard to have NVLINK work from NVIDIA), and the NVLINK Bridge must be connected. NVLINK Peer Group Size needs to be set to the number of cards bridged, in my case 2. However, when this combination is employed, the textures become all terrible like you see above.
System: AMD Threadripper 3960X, 128 GB RAM, 2x PNY RTX 3090 REVEL, EVGA 4-slot NVLINK bridge, always running the latest Studio drivers and the most up-to-date version of Windows 11.
It took me some time to revisit all of this. Sorry ~
It was exactly the same setup I used 1.5 years ago for a test like the one you guys are doing now. Hardware at that time: AMD Threadripper 3990X, 256 GB RAM, 2x Quadro RTX 8000, 2x Quadro RTX 6000, 1 Nvidia NVLink bridge. System/software: Win10, Nvidia Quadro RTX Studio Driver, DAZ 4.15.
Again, same setup: NVLink connecting the 2 Quadro RTX 8000s, SLI enabled in the Nvidia Control Panel. With SLI enabled, the information in Daz Render Settings - Hardware - Devices was different from when it was disabled, so I was happy that DAZ could detect the link ~
Then I ran a series of tests from simple to complex scenes, from peer group size 0 to 4. All ran well with no problem; I just didn't get the expected rendering performance, around a 5% time saving mostly. But I was sure of one thing after checking the log: the DAZ + NVLink + SLI setup did work ~ However, one day a couple of months later I updated the Nvidia driver; SLI was automatically disabled, and I found that the 'green line' in the panel now spanned all 4 cards instead of 2 as before. If I enabled it, my monitor would not light up. It only worked with 2 cards plugged in. That was the real reason I finally gave up on it~ With or without NVLink, it was not a big deal to me anyway~
I've never seen such a texture problem before. Have you reported it to DAZ? My gut feeling is that the issue may well result from the driver ~~
I haven't reported it to DAZ; I thought it was just me, and I was going to replace my NVLINK bridge as a first step, thinking maybe it wasn't working correctly (my VRAM pooling in Windows also shows some strange numbers, which is why I decided to replace the bridge). But then I ran into Rex, and he's having the same exact issue (his VRAM reports correctly to Windows) when NVLINK Peer Group Size is set to 2. So after the animation I am working on is done rendering, I will clear out the logs, set up a render, and report it to DAZ with fresh logs.
As I understand it, being in SLI/NVLINK will actually give a small performance hit, not a boost, but the key feature is VRAM pooling, which allows for really complex scenes (texture-wise, not geometry-wise). Most of my scenes currently will not benefit from VRAM pooling, so having it disabled is OK. However, a couple of times a month my scenes do exceed the 24 GB of each card, so having VRAM pooling is better than not; otherwise I have to CPU render, which takes forever...
Testing SLI NVLINK, Memory Pooling and CPU Rendering In Daz Studio
I made a video testing these various connections.
Oh my~~ that's you ??!!
Yea, that's me, lol
I watched 1/3 of your long video in the morning, fast-forwarding sometimes. Just a reminder: if you insist on using Task Manager to observe the GPU load, you'd better choose CUDA from the dropdown list on the right panel; then the data should be more stable. I still suggest you use GPU-Z. GPU load, VRAM used, etc. are all there and more accurate than Windows'~
I'll continue to watch it tonight~
I don't have a CUDA option; a while back someone here on the forums suggested better performance could be had by changing some sort of parameter in Windows. I forget the thread I read it in, and I have forgotten the parameter I changed. But you will notice I have a 3D option, and when I click the dropdown, CUDA is gone; it got replaced by a 3D graph. I was told in that Daz forum thread that letting Windows handle the CUDA cores gave some kind of huge performance boost. Maybe someone will remember more than I do and link that discussion to this thread. I would test that out as well in a future video. Thank you very much, Crosswind, for watching some of my video!
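For anyone who wants per-card VRAM numbers without relying on Task Manager's dropdown or GPU-Z, Nvidia's command-line tool can report them directly, e.g. `nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv`. A small sketch that parses that CSV output per card; the sample string here is an assumed stand-in for the real command's output, which you would capture with subprocess:

```python
import csv
import io

# Sample output from:
#   nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv
# (captured here as a string; in practice you would run the command via subprocess)
sample = """index, name, memory.used [MiB], memory.total [MiB]
0, NVIDIA GeForce RTX 3090, 18200 MiB, 24576 MiB
1, NVIDIA GeForce RTX 3090, 18150 MiB, 24576 MiB
"""

def vram_per_card(csv_text: str) -> dict[int, int]:
    """Map GPU index -> MiB of VRAM currently in use."""
    reader = csv.reader(io.StringIO(csv_text.strip()))
    next(reader)  # skip the CSV header row
    return {int(row[0]): int(row[2].strip().split()[0]) for row in reader}

usage = vram_per_card(sample)
print(usage)
```

If both cards show near-identical usage during a render, the scene was most likely duplicated to each card rather than pooled, which matches the behavior described earlier in this thread.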
I apologize for shamelessly plugging my YouTube channel here in this forum, but I do not make a dime from my channel yet. It is not monetized, and I have been making free Daz videos for over 5 years on YouTube with zero income from this effort. So if people do subscribe, click the bell for notifications, and please watch my videos; I may one day meet the criteria to actually get paid something for my efforts there.
I have the kind of job that I would do anyway even if I am not paid for it.
I must get 1000 subscribers (almost there) and 4000 hours of viewing time in a year (and I am only at 1000 so far).
Daz Studio advanced users (and beginners), come be part of the live video chat and you can share your knowledge as well. I love learning too!
Best to you!
Sounds great and keep going with it! I also make free tutorials for hobbyists, just for fun ~
Hey Crosswind, please share a link to your tutorials here. It is all about learning and having fun while doing so. Best of luck to you!
Rex, unfortunately those Daz tutorials are not in English~ I replied to your message with a link under your video; you can just take a look~ Cheers!
Well peeps,
Nvidia just put out a new studio driver, memory pooling is still not fixed.
Windows just put out an update; same result, no working memory pooling for us 3090 Daz users.
I am beginning to think this is a Daz Studio problem...
Windows is showing my cards have their memory pooled. It just does not work in Daz when rendering.
Test if NVLink is working for vram pooling?
https://www.daz3d.com/forums/discussion/353011/test-if-nvlink-is-working-for-vram-pooling
This is an interesting read; it seems Microsoft is the reason why memory pooling is broken.
I wish they would fix this already and give users the option for this feature on their workstation.
After all, a computer is not only the OS but thousands of dollars worth of hardware and software as well, and in light of this, Windows is a small part of this equation.
Maybe Unix is the way to go… I hear Microsoft has bought up a lot of stock in Unix as well.
And why can’t system RAM be utilized as well to hold 3D scenes? Why can’t Windows pass off rendering tasks to secondary PCs? It is probably possible, just not implemented.
Why must whole scenes be loaded into each card while the PC sits idle? What a waste of hardware.
A dark time in 3D rendering history.
Eventually graphics cards with hundreds of GBs of graphics memory will remedy this conundrum.
NVIDIA’s quad-slot GeForce RTX 4090Ti/TITAN 800W graphics card has been pictured
https://videocardz.com/newz/nvidias-quad-slot-geforce-rtx-4090ti-titan-800w-graphics-card-has-been-pictured
Now one must wonder, is 48gb enough?
(96 would be better.)
The thing is, I have nearly 4 TB of models on a solid-state drive.
Making a scene does not create a 96 GB project file... It renders out, at best, a 10,000 x 10,000 pixel PNG while only utilizing, and saving pointers to, data that already exists to make the scene.
How many CUDA cores would be needed to render a 96 GB project at 10,000 x 10,000 pixels?
Just wanted to revive this thread, as I was trying to get my two 3090s to utilize "memory pooling" via NVLink.
Nearly five months after JasonWNalley's findings, where memory textures are corrupted... I think nothing has changed. So whatever Windows 11 updates and Nvidia "Studio Driver" updates took place... it's still broken.
-Used a 3-slot NVLink bridge designed for Nvidia A6000 and Ampere cards.
-Plugged a second HDMI display into the second GPU (pugetsystems.com says this is necessary on RTX 30x0 cards).
-Set "Maximize 3D Performance" on, then rebooted.
-Set "NVLink Peer Group Size" to "2".
-Have both cards checked on.
Anyone else have any success, or find a workaround other than creating a render farm in Linux for the cards?
I can only say I never saw this issue with my 4x Turing Quadros + NVLinks 2.5 years ago, but I saw it in Rex's video and your pic. I thought it must have resulted from the driver's 'pooling mode'... but I had no way to test, as I only have one A6000 + old Quadros.
Have you submitted a ticket to Nvidia?
Instead of waiting for updated/fixed drivers, I would try going back far enough to find working ones.
My progress has not been progress at all. Just frustration.
It seems I can't find a driver that works and still allows dForce to keep working.
Anyway, here's another image to show the weird texture problem in the hair and his face. In the earlier image I posted, the skin jumps between looking wet and flat in weird stripes. It's cropped images like this one that push my VRAM to the max of my primary display, and over it if I open a browser or other programs. I can only imagine that as textures become more and more 8K, the situation will get worse.
I think I'm going to give up on this. Or look into creating a render box but what a pain in the butt!
Don't mean to necro an old thread, but was this ever solved or confirmed after any Daz/Nvidia update since June?
Does VRAM pooling work in Daz?
If yes: how, what is needed, what settings?
If No,
I don't think Daz or Nvidia are to blame for this not working, it is Microsoft that has decided that memory pooling should not work in any version of Windows.