Sporadic Stepper Stalls

Discussion in 'CNC Mills/Routers' started by Batcrave, Jan 31, 2020.

  1. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    I've been having some trouble with my half-width Lead 1010 (so is that a Lead 1005? Lead 0510? Lead 1010/2? Eric the Half-A-Lead?) lately, and could use some input. I suspect the problem's always been there to some degree & was just rare enough to overlook or put off dealing with, but it recently jumped out and bit me as I was jogging into place to do some final cuts on a workpiece that already had a day or two's work invested, making me hesitant to do anything else until I've finally pinned down exactly what's happening & why.

    I'm running into occasional - but infrequent - stalls on all four steppers (not at the same time). It doesn't seem reliably tied to any particular speed, position, or acceleration setting - sometimes when first starting to move an axis (as could be caused by too high an acceleration), but other times well into a long move, after the acceleration should be completed - and it happens (or at least can happen) when a single axis is being used.

    I've tried a lot of things over the past couple weeks & written off a lot of others - I can go into more depth later, but I'm going to hold off on listing them for the moment. Not that I'm fishing for opportunities to repeatedly say "I already tried that", but because One, I could use some fresh brains on this that aren't already clogged up with my mistaken assumptions and unfounded conclusions, but mostly, Two, because the post was getting so **** long no one ever would've made it to the end to find the "reply" box.


    -Bats
    (yeah, this is the short version)
     
  2. David the swarfer

    David the swarfer OpenBuilds Team
    Staff Member Moderator Builder Resident Builder

    Joined:
    Aug 6, 2013
    Messages:
    3,238
    Likes Received:
    1,815
    my bet is EMI, maybe static discharge from a vacuum hose, or noise from the router through an intermittent connection.
    what controllers/drivers?
    what software?
    good USB cable?
    good star ground?
    laptop or PC? PC is normally grounded, laptop not, so this could cause a ground loop.

    max speeds? any sticky spots in the leadscrews or wheels? dirt on wheels? if you slow down all accelerations and all max speeds by 25% and run a test job, what then?

    heat? drivers will shut down if they get too hot, but sometimes just for a moment as they hover on the edge of 'too hot', so it's not all that obvious.
     
  3. Rob Taylor

    Rob Taylor Master
    Builder

    Joined:
    Dec 15, 2013
    Messages:
    1,470
    Likes Received:
    746
    ^All of the above. I've been dealing with EMI issues for a few weeks, it's no fun. Upside though, there's a lot of fairly inexpensive shielded wire out there.

    Also,

    What voltage?

    Are your current settings high enough (if it also happens during cuts)?
     
  4. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    Ok, I guess that's my first "I thought of that, but...". I actually was having EMI trouble with the limit switches (probably still am, but covered it up with debounce settings). The spindle cable is shielded, but the VFD isn't, which throws off a ridiculous amount of noise - but for testing purposes I've got the VFD (and even the spindle's LED ring) unplugged, and have been testing only a single axis at a time (although I didn't think to try it with only the single motor plugged in yet).

    I was also worried that the weight/inertia of the monster spindle was contributing, so I pulled that off too (I actually just pulled off the whole Z axis assembly - right now I'm running tests on the X axis carrying nothing but the plates-n-wheels assembly).

    The vacuum's still plugged in, but hasn't been running & isn't mounted on/attached to/in contact with the machine.

    Gecko G540 on a parallel port.

    Mach 3 on XP. I need to take another crack at switching to LinuxCNC, but it's probably not happening right away.

    Not really, but the only thing attached via USB is a PS3 gamepad. On the prior PC I discovered it was acting as an antenna for the VFD noise and causing crashes (of the bluescreen variety), but I haven't had any trouble with that since putting together the new box.

    Umm... sorta? :p

    I've got star-grounded shielded cable on the limit switches (although I'm not positive I actually got around to reattaching the grounds this last time) to the main controller/driver DC ground (which is also tied to the PSU ground), although I haven't seen any evidence of the switches triggering.

    The spindle cable is shielded, but goes to the VFD ground (I've never been entirely sure how to handle that situation with regard to star-ground shielding layouts), but is currently both detached and unpowered.

    The stepper cables are ungrounded, but I was under the impression that was fairly standard anyhow.

    I suppose the AC grounds could be another story...

    Desktop (well, rackless rack-mount?). I do have it running off a different circuit than the rest of the machine, though, so that the spindle and/or other power tools blowing a breaker won't take the PC down with it.

    No sticky spots or stuck wheels - able to run the screw from one end to the other by hand with very little resistance and, as mentioned above, I've stripped off pretty much all the weight. I completely loosened the wheels and still didn't see any difference.

    Feed-wise, I don't start seeing guaranteed (or frequent) stalls until I'm up somewhere well over 250ipm, but I still see the sporadic ones down below 100ipm. I assume the more consistent high-feed stalls are where I'm hitting the practical limits of the machine/motors/electronics, but the fact that the occasional ones persist at much lower speeds is what got me stuck on the idea that there was something else going wrong.

    Similarly, acceleration doesn't cause consistent stalls until I get around 55in/sec/sec, but I still see a lot of the occasional ones down around 25-35 (I've seen a few as low as 15in/sec/sec, but haven't done a lot of testing down in that range).
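    As a rough sanity check on the "after the acceleration should be completed" point, here is a minimal constant-acceleration sketch. It assumes the move starts from rest and ignores whatever Mach 3's motion planner actually does, so treat the numbers as ballpark only. The feed/acceleration pairs are taken from the figures mentioned in this thread.

```python
def ramp(feed_ipm, accel_in_s2):
    """Return (seconds, inches) needed to reach feed_ipm from rest
    under constant acceleration. Idealized: no jerk limiting, no planner."""
    v = feed_ipm / 60.0                # in/min -> in/s
    t = v / accel_in_s2                # time to reach full speed
    d = v * v / (2.0 * accel_in_s2)    # distance covered while ramping
    return t, d

for feed, accel in [(150, 15), (100, 25), (250, 55)]:
    t, d = ramp(feed, accel)
    print(f"{feed} ipm @ {accel} in/s^2: {t:.3f} s, {d:.3f} in")
```

    At 150ipm and 15in/sec/sec the ramp works out to roughly 0.17 s and 0.21", so a stall several inches into a long rapid is well past the acceleration phase.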

    The "test job" is the problem. I have yet to come up with one that triggers it frequently enough to be conclusive. The one I'm working with right now G0 rapids a single axis back & forth by 6", then a few times by .25", then a handful each at .05" and .02", then M47 loops endlessly, but if it doesn't stall on startup or in the first couple moves (like at especially high feed/accel settings) then I can often run it for a half hour, only to have it stall when I hit stop and try to jog instead. I've got another version with a G04 dwell between movements that I was hoping would replicate conditions if it were some sort of initial-current-draw-type issue, but that doesn't seem to be any more consistent.
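    For anyone wanting to reproduce a similar test, a job like the one described could be generated with a short script. This is a sketch only: the repeat counts are placeholders (the post just says "a few" and "a handful"), the axis letter and the G20/G90 preamble are assumptions, and M47 is used as the Mach 3 "repeat program" code mentioned above.

```python
def test_job(axis="X", long_in=6.0,
             small_moves=((0.25, 3), (0.05, 5), (0.02, 5))):
    """Build a back-and-forth rapid test as a list of G-code lines.
    Repeat counts are placeholders, not values from the original job."""
    lines = ["G20 G90"]  # inches, absolute positioning (assumed preamble)

    def there_and_back(dist):
        lines.append(f"G0 {axis}{dist:.3f}")
        lines.append(f"G0 {axis}0.000")

    there_and_back(long_in)
    for dist, reps in small_moves:
        for _ in range(reps):
            there_and_back(dist)
    lines.append("M47")  # Mach 3: repeat the program from the first line
    return lines

print("\n".join(test_job()))
```

    Keeping the distances and counts as parameters makes it easy to bias the job toward whichever move length seems to provoke the stalls.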

    I should probably also mention, one of the things that got me poking at the axes in the first place is that I'd occasionally get a loud "clunk" (like when a stepper first powers up & locks into position) in the middle of rapids (even 10-20" long movements with no speed/direction changes). It may or may not be related - I never did figure out what causes it - and it never seemed to cause any problems in itself (no step loss), but it was always a little concerning.

    I've worried about that in the past too, so I've been trying to watch the temps, but the Gecko is supposed to be good for 70°C (as measured on the back of the case), and I have yet to see it over 50°C. I've seen the steppers top 80°C after long periods of testing, but I got the impression they're used to running hot, and the stalls seem to happen just as often when they're cold (earlier I stalled on my first jog, only a minute after powering up the system for the day, at 150ipm and 15in/sec/sec).

    I'm running on a 36V / 10A / 360W PSU (one of these S-360-36 units).

    I assume it happens during cuts - but I've been lucky enough not to see it. I haven't done a lot of cutting since replacing the PC, though, and on the old one I was having so many problems, it's hard to say which of them this might have been responsible for (plague-era capacitors do all sorts of interesting things when they finally give up and start vomiting their guts).

    As for current settings, that's something I hadn't even considered. The G540 uses current set resistors soldered across a couple pins in the DB9 (ok, "DE-9") motor connectors, which I hadn't given a thought to since setting them up five years ago. It looks like I've got ~2.975k resistors for 3A motors (three KL23H256-21-8B and an OB "high torque" on the Z), though, which should be about right.
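    A quick check on that resistor value, assuming the 1 kΩ-per-amp rule from Gecko's G540 documentation. That rule is an assumption in this sketch and worth confirming against the manual before relying on it.

```python
def set_current_amps(resistor_kohm, kohm_per_amp=1.0):
    """Phase current implied by a current-set resistor.
    kohm_per_amp=1.0 is an assumed rule of thumb, not stated in the thread."""
    return resistor_kohm / kohm_per_amp

measured = 2.975   # kOhm, as measured in the post
rated = 3.0        # A, motor rating quoted in the post
amps = set_current_amps(measured)
print(f"{amps:.3f} A set, {100 * (amps - rated) / rated:+.1f}% vs rated")
```

    Under that assumption the drives are set just under 1% below the motors' rated current, so the resistors are unlikely to be the culprit.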



    One other note - I seem to vaguely remember seeing something similar to this with my previous machine having to do with resonance - leading to a lengthy investigation into dampers and flywheels - that may have been eventually solved with vibration damping stepper mounts (memory is fuzzy and notes are missing)... but the mounts aren't compatible with the Lead unless I mod/make new end caps for the C-beam, and I've been reluctant to get into the project of building fancier Rattler-style torsional dampers unless I'm pretty **** sure (ok, fine, "pretty ******* sure") it's really a resonance-related problem.

    I did swap the helical flexible couplers for some jaw-type couplers with rubber spiders & shove some rubber washers under the stepper standoffs to see if it would make a difference (it didn't), but obviously the washers don't completely decouple motor from frame the same way the mounts would.


    -Bats
    [ witty postscript severed by runaway spindle after missing steps ]
     
  5. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    One more thing that's probably worth mentioning (especially since some connectors came loose and fried one of my drivers back in December) is that for the purposes of the test, the wiring is all basically stationary. Vibration aside, neither the motor nor the wires have to move anywhere, making a loose wire or intermittent connection seem less likely.


    -Bats
    (that's your problem, Bats - it's all stationary! everyone knows paper is a lousy conductor!)
     
  6. Rob Taylor

    Rob Taylor Master
    Builder

    Joined:
    Dec 15, 2013
    Messages:
    1,470
    Likes Received:
    746
    If you're using a built-in parallel port, makes me wonder if it's a weird driver/buffering/whatever internal issue to the computer. I don't think, instinctively (which means very little) that it has anything to do with the machine components, more that something within the PC doesn't like itself. Which doesn't make a lot of sense, but there's nothing about your machine setup that should be giving you this issue unless you somehow, miraculously got a bad G540.

    Hmm... What exactly do you mean when you say "stalling out?" As in, the steppers grind to a halt and make an incredibly obnoxious noise for the rest of the calculated move time?

    Do you have a junk desktop-desktop laying around? If not, I grabbed an ex-corporate no-HDD Dell Optiplex i3 for $30 shipped from these guys when I ran out of testable computers for LinuxCNC: Southeastern Data | eBay Stores - throw a $20 SSD and $5 parallel card into it, it's good to go. I ended up getting a Mesa 7i76 PCIe FPGA card for it, which is generally awesome because all the stepgen happens on the card, no heavy lifting from the computer (and gives me access to PNCConf), but a lot of money ($100ish?) if you're still trying to just make the system work. Obviously not ideal when you just bought a new machine, but...

    Without differential A/B testing, I'm not sure there's an easy way to narrow down an intermittent problem like this. If you do decide to go LinuxCNC, I can walk you through it- a basic stepper setup can be cutting chips within a couple hours of plugging the machine in, no manual programming required. Just gotta ISO To USB an image of LinuxCNC, which is a custom Ubuntu 18.04. Grbl would also work for testing and continue working with XP, I imagine, as long as one of the senders can work with it. bCNC is Python 2.7, so that might work. I assume you're not connecting an XP machine to the internet, or I'd suggest OBCONTROL.

    I suspect that the problem might be some kind of latency/catch-up issue with the software stepgen, so anything to get it off that machine and/or OS might be useful in diagnosis.
     
  7. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    Ugh. I wrote a lengthy (yeah, yeah, I know...) reply to this on Saturday, and it looks like the forum ate it.

    I even made a backup in notepad++... which I then deleted after it seemed to successfully post.

    No wonder no one was responding.

    *sigh*

    Guess I get to start over.


    -Bats
    (and here I was worried it was because I didn't shower this morning. I guess maybe I should be relieved)
     
  8. Rob Taylor

    Rob Taylor Master
    Builder

    Joined:
    Dec 15, 2013
    Messages:
    1,470
    Likes Received:
    746
    *Waits impatiently*
     
  9. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    It's gonna be a while.

    But in the meantime (on the only-possibly-related topic of resonance), I ran into this article recently, which was mostly familiar stuff, but brought up an interesting point that I hadn't realized:

    The resolution of microstepping drivers often drops as the rotational speed increases; the reduction in resolution is inevitable due to the limited bandwidth of both controller and driver. For argument's sake, imagine a designer attempted to operate a microstepping controller with 40,000 steps per sec at 50 rps (3,000 rpm.) It would then have to output 2,000,000 microsteps per sec to keep all the steps. Even if this were possible, a typical PWM driver only operates at 20 to 40 kHz — so the fine interpolations would never reach the motor. To address this inability to hit every microstep at higher speeds, the number of microsteps per second is often reduced as the motor speed increases. Transitions between these different resolutions can cause an impulse in torque to the motor, causing ringing that can result in lost steps.
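    The arithmetic in that passage can be checked in a couple of lines. Note the assumption: the 2,000,000 figure only works out if the 40,000 is read as microsteps per revolution (the quote as pasted says "per sec"), so that reading is used here. The 200 full steps per revolution and 10 microsteps in the second line are typical values added purely for comparison.

```python
def step_rate_hz(steps_per_rev, rev_per_sec):
    """Step pulses per second needed to hit every (micro)step at a given speed."""
    return steps_per_rev * rev_per_sec

article = step_rate_hz(40_000, 50)        # the article's example at 3,000 rpm
tenth_step = step_rate_hz(200 * 10, 50)   # assumed 1.8-degree motor, 10 microsteps
print(article, tenth_step)
```

    The gap between the two results is the point of the passage: fine microstepping at high rpm asks for pulse rates far beyond a 20 to 40 kHz PWM stage.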

    I'd thought the G540/G250X was permanently fixed at 10 microsteps, but according to their site:

    The G250X, the heart of the G540, features proprietary anti-resonance and motor smoothing techniques. At a native ten microstep resolution the G250X does not require extremely high input frequencies to operate at high speed but offers enough resolution to match the accuracy of most stepper motors (+/- 5% of one full step). Full step morphing at higher speeds transitions the drive to send a true square wave when the benefits of microstepping are no longer present. An adjustable low speed smoothness trimpot compensates for motor nonlinearity at the low end while midband resonance compensation ensures reliable torque output through the midband dropout region.

    Of course, I thought that "midband resonance compensation" of theirs was supposed to cancel out most resonance-y stalls, so this may be just a red gecko herring.

    Also interesting (and also not necessarily related) in the Machine Design article is something I was sort of intuitively aware of - and was one of the first things I suspected of being responsible for the "clunks" - but don't remember having seen anyone explicitly mention before:

    For example, gear trains release the load when changing direction, due to backlash. While the load is uncoupled from the system, the motor accelerates (because of lower inertia) until the backlash has been taken up. When the gears engage again, the difference in velocity between the motor and load can reflect excess torque back to the motor. Thus the system cycles: The motor slows below the speed of the load, again the load decouples, and then the motor speeds up. In some cases, the change in speed may be enough for the gears to first strike on one face and then rebound and strike on the opposite side, to repeat several times. The exact timing of the reversal ringing may vary with both the position of the gear train and with the wear of the gears, making it difficult to choose a stepping sequence that compensates.

    Obviously the lack of gears in the Lead gives it less room to happen, and after checking the nuts & replacing the collars, I think I've got the backlash pretty much eliminated from the screw. I have trouble imagining there's enough flex in the motor mount to cause trouble, either (especially without everyone else facing similar problems), but it still seemed interesting enough to offer as a stalling tactic while I try to remember what I actually wrote in the other post.


    - Bats
    (shhh! not now! I'm typing!)
     
  10. Rob Taylor

    Rob Taylor Master
    Builder

    Joined:
    Dec 15, 2013
    Messages:
    1,470
    Likes Received:
    746
    Since it seems to tend to be on the lower end of the speed scale, have you tried adjusting the G540's smoothness trimpots? In theory, that's what they're for, but I don't have one (yet?) so I don't know much specifically.

    The clunking sounds like it may be related- the teeth aren't quite finding the spot where they need to be fast enough and are skidding around as the field moves.
     
  11. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    Nope- I thought the same thing for quite a while, but those are for the very low end of the scale - think single-digit IPMs

    That's certainly what it sounds like - since it's exactly the same sound as when the motors first power up and lock into position (although isn't that basically the same description as the "ringing" that's at the heart of resonance issues?). Oddly enough, it also does it at the end (or beginning?) of a program when it hits an M47 and loops back to the start - although it doesn't happen before or after G4 dwells (of any length).

    After some work last night, though, I'm pretty **** sure at least part of the problem is software/system related - just apparently somewhere outboard of Mach 3, where it's not reflected in any of the resource monitor-ish DROs. [side note: I'm also pretty **** sure I need a row of little bat icons to appear instead of ****s every time I say ****. Openbuilds really should get on that]

    I was trying to get a more granular look at what was going on with system load - to see if maybe there were brief spikes that I didn't see in XP's Task Manager - so I fired up HWInfo32 with a 500ms sample time while sending a (uncoupled) stepper on a nice, long, rapid (500 inches, or so), annnnd.... it started stalling approximately twice a second.

    These did sound to be longer than the "clunks", but they weren't the lock-up-until-the-next-move variety - they seemed to recover quite quickly and the move would continue (although I didn't have an easy way to see whether or not steps were being lost). I had been running Speedfan with a long polling time to control my temps (both the rackmount case and the shelf it's living on have lousy air circulation, which means it needs noisy fans - so I try to keep them low when the load & ambient allow), so now I'm suspecting that may have had something to do with it.

    Of course, I also discovered I could get a similar, if less predictable, effect by firing up something resource-hungry like Firefox - and (as far as I know) FF should live mostly in userland and doesn't use the sort of low-level hardware polling that lets HWInfo cause so much trouble.

    I know onboard GPUs also have a bad reputation in CNC circles (and I don't currently have a good way to monitor GPU usage - have to see about digging up an older version of Afterburner), so I may have to add a low-profile video card to the test hardware shopping list.

    So I'm still not sure if it's the problem (and there are still a few points I want to follow up from your earlier post), but I've definitely found a problem to try fighting with tonight. Which, at least for the moment, is better than fighting phantoms.


    -Bats
    ( probably safer, too... after all, The Phantom was a notorious ***-kicker )
    ( see! those should be bats! )
     
  12. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    Quick update on the system-y issues - I'm becoming more convinced that the clunks and stalls are part of the same problem.

    Using MSI Afterburner to get a more granular look at GPU & CPU activity (a 100ms polling rate which - unlike HWInfo - didn't cause any trouble, interestingly enough) didn't show anything that seemed to correlate - at least while the system was idle. There's still the clunking/stalling under CPU load issue I noticed earlier, but I don't tend to go launching browsers while I'm cutting, and I got distracted by another line of investigation - so, at least for the moment, shopping for a discrete GPU is off the "urgent" list.

    Following up on last night's tests, I found a way to reliably reproduce the clunks - or at least something that looks and sounds a hell of a lot like them.

    From my notes:
    • All of these tests were done on a mounted-but-uncoupled motor during long (500in) transits at a variety of feed rates, so acceleration shouldn't be a factor. Resonance may be, but would probably look different than if it were driving a screw.
    • HWInfo32 causes a series of rapid clunks on startup, then, if Safety->CPU Clock Measurement->Bus Clock-based->Periodic Polling is enabled, it clunks once per polling cycle (down to ~2sec), even if all sensor monitoring is disabled.
    • In the neighborhood of 135-140ipm (~420-440rpm) those clunks stretch out longer. Then, at around 140+, they become unrecoverable stalls (like before, of the noisy torqueless variety).
    • Mach 3's Blended Speed and Pulse Frequency DROs both show very slight dips (that I overlooked before - like the proverbial four-leafed clover... but less lucky) as the motor clunks. Even at feeds where the stalls become unrecoverable, the dips are only momentary.
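    For reference, the feed-to-rpm figures in these notes are consistent with an 8 mm-per-revolution lead screw. That lead, the 200 full steps per revolution, and the 10 microsteps are assumptions in this sketch, not values stated in the notes.

```python
MM_PER_INCH = 25.4

def rpm_and_pulse_hz(feed_ipm, lead_mm=8.0, full_steps=200, microsteps=10):
    """Convert a feed rate to screw rpm and step-pulse frequency."""
    lead_in = lead_mm / MM_PER_INCH    # inches of travel per revolution
    rpm = feed_ipm / lead_in
    pulse_hz = rpm / 60.0 * full_steps * microsteps
    return rpm, pulse_hz

for feed in (100, 135, 140, 250):
    rpm, hz = rpm_and_pulse_hz(feed)
    print(f"{feed} ipm -> {rpm:.0f} rpm, {hz / 1000:.1f} kHz")
```

    With those assumptions 135-140ipm lands around 430-445rpm, close to the ~420-440rpm in the notes, with a step-pulse rate in the 14-15 kHz range.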
    Now here's where it gets interesting...

    Without HWInfo32, Speedfan, or anything except Mach 3 running, the motor will clunk twice, about once a minute, approximately a second apart. The period isn't precisely a minute (so I'm guessing it's not a "run every minute" task), but it's awfully close. When I first spotted it, it would clunk at x:xx:15 and x:xx:17. By an hour or two later it'd gradually drifted to x:xx:21 and x:xx:23.
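    That drift puts a rough number on the period. A sketch, assuming exactly one clunk pair per cycle and taking "an hour or two" as 60 to 120 minutes:

```python
def period_from_drift(drift_s, elapsed_min, nominal_s=60.0):
    """Estimate the true period of a roughly-once-a-minute event from how far
    its position on the clock slips over elapsed_min minutes."""
    cycles = elapsed_min * 60.0 / nominal_s   # approximate number of repeats
    return nominal_s + drift_s / cycles

drift = 21 - 15   # seconds of slip between the two observations
for minutes in (60, 120):
    print(f"{minutes} min: ~{period_from_drift(drift, minutes):.2f} s")
```

    That gives a period somewhere around 60.05 to 60.10 s, which would fit a timer that re-arms about a minute after each run rather than one pinned to the wall clock, though that interpretation is a guess.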

    These aren't identical to the HWInfo-induced clunks - they seem to be a little briefer/gentler, less reliable about turning into stalls, and show a smaller dip on the DROs (around 0.3-0.4ipm, instead of ~3-4ipm) - but I'm not sure how significant that is.

    I guess the next step is to go through and kill off every system process one by one until either the clunks stop, or XP does.


    -Bats
    (and if my PC thinks that last bit sounds like a threat... well... it wouldn't be wrong)
     
  13. Rob Taylor

    Rob Taylor Master
    Builder

    Joined:
    Dec 15, 2013
    Messages:
    1,470
    Likes Received:
    746
    This is pretty good sleuthing! Are there no ancient forum posts online somewhere about this? Seems like XP/Mach 3 would have been used a lot back in the day, and someone would have run into this.

    I think I vote LinuxCNC though. :D

    That would be amazing. While they're on, they should make media embeddable from Instagram. Facebook and Dailymotion? What is this, 2014?!
     
  14. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    I imagine Mach 3 + XP is still a heavily used combination - Mach 4 is supposed to be a bit of a [bat][bat][bat][bat]-show, plenty of people can't be bothered to replace the OS on anything but their primary machine (if even that), and I've seen posts from countless users who decided it was easier to just yank their CNC box off the net, rather than bothering to deal with the hassles of replacing/upgrading it & working out the bugs ([bat][bat][bat][bat]ing stalls *grumble*) all over again. And then there are the idiots who just grabbed an old system they had lying around and were too [bat][bat][bat][bat] cheap to throw another $100 at a new OS license for their >$1-2000 machine. Like me.

    The fact that this problem isn't right at the top of every "How to Install Mach 3" on the web, though, suggests it's probably not always a problem when Mach 3 and XP come together - and it could even be something specific to the way XP interacts with this particular motherboard (which kinda postdates XP anyhow).

    Hopefully I'll be able to say for sure once I narrow down exactly what's causing it. Nailed it!

    (and, however unlikely, I hope this eventually proves useful for someone)

    When I started on this reply, I'd gotten as far as tracking it to a svchost process (the one running as the LOCAL_SERVICE user). The next step was to pull out Process Explorer & see just what svcs it was host-ing.

    It turns out that particular svchost handled the TCP/IP NetBIOS Helper, the SSDP Discovery Service, and the UPnP service that depends on it, none of which are particularly critical (or should arguably be off anyhow) which makes life easier, but just to be thorough (and because I wasn't going to be getting anything else done tonight), I narrowed it down to SSDP Discovery.
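    For anyone wanting to script the same change, here is a hedged sketch that only prints the commands by default. The short service names are assumptions based on a stock XP install and should be confirmed in services.msc (or with `sc query`) first; the helper names are made up for illustration.

```python
import subprocess

# Short service names are assumptions based on stock XP; verify before use.
CANDIDATES = {
    "SSDPSRV": "SSDP Discovery Service",
    "upnphost": "Universal Plug and Play Device Host",
    "LmHosts": "TCP/IP NetBIOS Helper",
}

def disable_commands(service):
    """Return the sc.exe command lines that stop a service and keep it off."""
    return [
        ["sc", "stop", service],
        ["sc", "config", service, "start=", "disabled"],
    ]

def run_disable(service, dry_run=True):
    """Print each command; only execute when dry_run is False."""
    for cmd in disable_commands(service):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.call(cmd)   # needs an administrator account

run_disable("SSDPSRV")   # dry run: prints the commands without running them
```

    Since the UPnP service depends on SSDP Discovery, it would likely need to be stopped first for the `sc stop` to succeed.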

    Now, I have no idea why SSDP would be causing hiccups in net-unaware software and/or parallel ports, but disabling the service stops the clunks, and I just ran one (uncoupled) stepper on a long/short/v.short zigzag toolpath at 300ipm,100in/s/s for a completely clunkless ten minutes before seeing the first stall (presumably due to slamming through ten minutes of direction changes at unhealthily high acceleration - not something I'd consider with ten pounds of brick spindle riding on it).

    It remains to be seen whether that was the only problem (who am I kidding? there are always more problems), but I think it was the big one. I did end up seeing some clunk-y behavior from a number of other activities, but most of them were things I wouldn't be inclined to do while the machine was running anyhow. The only one that worries me a little - because it was such an innocuous action - is that Right Click->End Process in Task Manager (not ending a process, just popping up the dialog) tended to clunk, which I wouldn't have expected... but I don't see it from other dialogs or context menus (and, again, I don't tend to leave Mach 3 when the machine's moving), so, like the even worse clunks when opening browsers, I'm not going to sweat it for the moment.

    Thanks for the help (David, too) - I probably made more progress talking it out over the past couple days than over the previous couple weeks struggling in silence. Or struggling with it in a not-at-all-silent cloud of profanity.

    That's still the direction I'm looking at going in the long run, but as I mentioned in the post you never got to read (which is a pretty great excuse for anything else I want to claim I'd already thought of), while I got LinuxCNC more or less working (if far from a usable form - that stock interface is pretty barren), I was running into X-Windows headaches that I ran out of time to fight with. For some reason the screen buffer would periodically overwrite everything with a collection of old windows, making it impossible to see what was supposed to be there without forcing a screen redraw.

    The one thing that's really been frustrating me about the Mach3/Linux choice, though (not to be confused with Mach 3.0 linux), and the problem of moving away from software stepgen in general, is that any hardware path beyond a parallel port that I've looked at pretty much means being locked into one platform or the other. LinuxCNC's preferred Mesa boards don't work with Mach 3 (or 4, iirc) and LinuxCNC doesn't deal with options like the Smoothstepper or UC100. Otherwise I wouldn't have even bothered tracking down a motherboard with an LPT header.

    Tangential question: Have you ever tried getting PathPilot running on non-Tormach equipment?

    I'm not sure I realized Dailymotion still existed. Did people even use it in 2014?


    -Bats
    ([bat][bat][bat][bat] it, Mark! I'm going bat-[bat][bat][bat][bat] without my [bat][bat][bat][bat]ing bats!)
     
  15. Rob Taylor

    Rob Taylor Master
    Builder

    Joined:
    Dec 15, 2013
    Messages:
    1,470
    Likes Received:
    746
    True! Sometimes all this "maintaining a PC" stuff is just too much effort. I don't know how many of them are likely to be posting bugs on forums, though, I would expect them to be running pretty consistently by now. I think in an ideal world the CNC controller wouldn't be networked at all, but manually moving NC files around is just the worst.

    Oh yeah, new hardware with old software always seems to throw up some kind of weirdness.

    Yay!

    Seems like you should be able to turn all of those off in a non-networked, hardware-locked machine without any issues at all. Win10 disagrees, of course, but it would:

    [attached image: upnp.JPG]

    SSDP says a similar thing. I guess if you manually install drivers for any UPnP-based devices (including all the "devices" that are actually just part of the motherboard), and don't network anything... It should be alright? Assuming they're available, of course, and not just relying on UPnP to work.

    I'm guessing it clunks because it throws a high-priority process through the queue every so often, or something along those lines, rather than actively interacting with the parallel IO hardware, but if the parallel port is a "UPnP device" even though it's mounted to the motherboard... I guess it could be the latter. Could you simply de-prioritize the service process, somehow? I don't recall XP having as much granular control as later versions, but it might be sufficient to keep everything happy and vaguely functioning without worrying about something destabilizing the OS halfway through a job, I dunno.

    I wouldn't think that the parallel hardware WOULD be UPnP, should be Legacy Plug and Play (the one that never worked, remember?), but maybe the parallel/serial controllers are part of one module or something, I have no idea and I'm just talking [bat][bat][bat][bat].

    Really does sound like a process priority issue, like Mach 3's getting kicked out of the queue every time you look at something "important". It's weird.

    I need to learn to talk stuff out more too, I'm horrible about that. Considering the caliber of the people here, it's pretty silly.

    You really do find the weirdest errors to have. Never come across that one with any version of Linux I've ever run on any device, so...

    Shame Mach 4 didn't include the Mesa boards, they're really nice. Seem simple enough that you could potentially DIY them, too- Mesaflash just dumps the stepgen/comms firmware on the FPGA, the rest of the card is just PCIe IO, it seems like. I don't know if they put out a reference design for any of this stuff, but they seem to have enough commercial customers that I'm not super concerned about them going anywhere. As RISC and FPGA rise in popularity, the number of options for that kind of thing will probably skyrocket.

    The more I think about it, the less I enormously care about platform lock, either. It's not like any of us are just gonna sit and engineer grbl or LinuxCNC from scratch. They're black box tools that we use and are at the whims of to the tune of thousands of dollars. More competition is largely good, but I'm not for it to the point of fragmentation. I'd rather have the pool of talent working on a smaller, higher-quality set of options.

    Nope. I don't have a huge issue with LinuxCNC's UI, I know there are projects out there to replicate it, but since it's now just a skinned LinuxCNC it seems pretty pointless. I found LinuxCNC to be fairly plug-and-play, not too far off grbl, in terms of getting up and running with the basics. There are more complex things that require HAL and all that weirdness, but they're things that most other control software can't even do, so it's a net win.

    I think it existed still, but YouTube had eaten its lunch by then. No idea if it still exists, didn't bother checking. :D
     
  16. Batcrave

    Batcrave Journeyman
    Builder

    Joined:
    Apr 20, 2018
    Messages:
    361
    Likes Received:
    165
    Oh, absolutely. Aside from the occasional "everything worked perfectly until I had to replace a dead [PC component], why's it breaking now?", the information has been pretty much static for years... but so has the software and OS.

    Stuxnet would be so disappointed to hear you say that (although I seem to remember that being carried across airwalls on infected USB anyhow)... But, yeah, I've done the gcode sneakernet thing, and it sucks. Realize something needs to be tweaked, run back to the main PC in the other room, tweak, run back & get the USB drive you forgot in the CNC, back to the main PC, copy, run it back to the CNC again... only to discover there's something else wrong & you do it all over again.

    It's bad enough having to run back to the CAD/CAM on the desktop in the other room - but even if the CNC box could handle the software, it's not somewhere I'd be comfortable working on for more than five minutes at a stretch - so, network it is.

    Win10 would suffer a dramatic fainting spell if you even suggested the possibility that a machine might have to exist off a network.

    I don't think there's anything onboard that should require UPnP, and driver installation was done before I even installed Mach 3 (and I [bat][bat][bat][bat]-sure wouldn't be installing any new ones while the machine was cutting *shudder*). SMB file sharing doesn't seem to have any need for it either, which is the only netty thing the system needs to handle - it's certainly not going to be forwarding any open ports or playing host to random IoT gadgets, or whatever actually uses UPnP these days (maybe chromecast the Mach3 display to the smart TV, so I can watch from upstairs, where I don't have to listen if a stepper stalls halfway through?). I'm not even sure I have it enabled on my routers (then again, I'm old fashioned enough that I didn't even bother running DHCP until a few years ago). [afterthought: I wonder if the Ethernet Smoothstepper uses UPnP?]

    If Gigabyte was crazy enough to route their parallel traffic over IP, then anything is possible.

    I know there were programs (maybe even something in the Sysinternals kit) that could permanently set process launch priority on XP, but I don't remember ever seeing anything similar for services. I'm not sure it would make a difference either (although I'm speculating without very concrete evidence), since boosting the priority on Mach 3 (something I'd tried early on) didn't seem to do anything. Also, if you remember HWInfo's rapid clunking, it only runs at Normal priority, but on startup it polls a whole [bat]load of hardware sensors, and that's the point where it clunks merrily away like a death metal drummer with a double-kick kit an inch away from an amphetamine heart attack.


    Oh, I fondly *cough*gag*sputter* remember the halcyon days of Win 95's "Plug and Pray" (which had the amazing ability to make manual configuration feel frustration-free in comparison). To give Microsoft credit, though, modern plug & play actually does tend to work most of the time - and it only took them a couple decades to do it, too! After all, when was the last time you had to figure out just the right IRQs and range of memory addresses to allocate to a new video card (or go through three different models to find one that doesn't conflict with your mouse)? Just don't give them too much credit, since their job got a lot easier once everything became PnP and entire systems started being designed around it, instead of leaving room for all those inconvenient "users" and their messy "choices". I'm also not sure how much of the modern plug-and-play configuration is actually handled by or directly descended from the classic MS Plug-n-Play.

    In the same sense, though, I didn't think UPnP was ever really a replacement for/extension of PnP - I thought it was basically just the same automagical config concept applied to networky devices and saddled with the same catchy name. I could be entirely wrong on this, though, and I couldn't (or couldn't quickly) find anything authoritative.

    I don't remember quite how motherboard buses are laid out (or how AMD laid them out in the socket-AM3 days) and I'm too [bat] lazy to look it u...

    Nope. As it turns out, I'm too lazy to get up and get the machine put back together again and it's easier to sit here and do a deep dive into hardware architecture.

    According to this:
    [​IMG]
    ...the parallel port & other Super IO devices are traditionally on the LPC bus, off the southbridge. AMD's top sekrit AM3 reference design (made extra low-res for sekurity purposes) also puts the LPC & SIO down south of the southbridge:
    [​IMG]

    So basically 99% of the system connects somewhere upstream of the parallel port, so theoretically just about anything could cause hiccups if it managed to reserve, occupy, or otherwise DOS the NB/SB/CPU... but all three are designed to handle huge amounts of simultaneous traffic, so it'd take something big to swamp any of them (presumably the sort of "big" that would show up as exactly the sort of big spike in CPU load that I haven't seen any of).

    The more likely sounding alternative is it's something else on the LPC bus. That leaves us with TPM (Trusted Platform Module - never actually used a motherboard with one plugged in), BIOS, and the "Debug Postcard" (which I'm going to assume handles Power On Self Tests, rather than mailing back pre-Instagram "wish you were here" photos to make friends and family jealous) - none of which seem very likely - or else another Super I/O device.

    The Nuvoton/Winbond Super I/O chip is the same thing that handles most of the motherboard sensors (temp, voltage, fan speed), which explains why HWInfo is so clunktastic (Speedfan and Open Hardware Monitor aren't, but HWInfo reads a wide range of sensors they can't)... but that doesn't explain SSDP/UPnP causing trouble there.

    But then, Wikipedia also sez that Super I/O handles...(wait for it)... Legacy Plug and Play. So maybe there is more of a connection than just the name. That's where my research hit a wall, though. The best I can find is a dictionary site (of all places) saying "UPnP Is Not PnP" - nothing about whether the one piggybacks on the other, or any association between UPnP or SSDP and Super I/O or Winbond/Nuvoton chips.

    It's probably a moot point anyhow, since I also couldn't find anything suggesting a case where that machine would need UPnP. SSDP (which, while we've (I've) gotten sidetracked on UPnP, is actually the root cause) is even clearer cut. It's an IP-based network discovery protocol, and - having no need of discovering anything on the network but the SMB shares on my desktop (which use a separate protocol stack) - it's pretty clearly superfluous, aside from UPnP treating it as a dependency.

    Also, the fact that it lends its acronym to Cloudflare's "Stupidly Simple DDoS Protocol" suggests I'm probably better off with both disabled anyhow.
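
    For anyone wondering what SSDP traffic actually looks like, here's a rough sketch of the discovery request it sends. The multicast address, port, and header names are the standard SSDP ones; the function names (build_msearch, discover) are just made up for illustration, and nothing here is specific to XP's SSDP service or to Mach 3.

```python
import socket

SSDP_ADDR = "239.255.255.250"  # standard SSDP IPv4 multicast group
SSDP_PORT = 1900               # standard SSDP UDP port


def build_msearch(search_target="ssdp:all", mx=2):
    """Return the bytes of an SSDP M-SEARCH discovery request (hypothetical helper)."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",             # max seconds a device may wait before replying
        f"ST: {search_target}",  # search target; ssdp:all asks everything to answer
        "",
        "",                      # the request ends with an empty line
    ]
    return "\r\n".join(lines).encode("ascii")


def discover(timeout=3.0):
    """Send one M-SEARCH and collect (address, status line) pairs until timeout."""
    found = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_msearch(), (SSDP_ADDR, SSDP_PORT))
        try:
            while True:
                data, addr = sock.recvfrom(4096)
                status = data.split(b"\r\n", 1)[0].decode("ascii", "replace")
                found.append((addr[0], status))
        except socket.timeout:
            pass  # nobody else answered within the timeout
    return found
```

    Calling discover() from a PC on the same LAN should list whatever answers. It's all plain UDP over IP, which is why it's hard to see what business it has anywhere near the parallel port.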

    Even after spewing all that techy stuff above, I'd still be pretty inclined to believe this, if not for the fact that boosting Mach 3's thread priority doesn't seem to have the slightest impact. Of course, I don't know how much impact the thread priority setting has on access to resources like superIO devices - where the devices aren't really meant for a lot of (or any) simultaneous access - anyhow, and I imagine most PC-ish devices wouldn't have any trouble if a signal was delayed for a fraction of a second, either.

    But if I take a minute to recklessly and irresponsibly speculate (I'm not at all sure I understand enough of the low-level operation of steppers / drivers to know if this is realistic)....

    If the stepper's running along at high speed (morphed to full-stepping, in Gecko's case) and the signal to the driver gets delayed just long enough that the next step pulse shows up a step or three's duration later than usual... A powered stepper can't freewheel along with the momentum like a lathe with the power switch flicked - it has to step or not step, right? Going from 450rpm to zero rpm and then back with no transition over the course of a handful of ms would make one [bat][bat][bat][bat] of a "clunk", and, if it's going fast enough, I have to imagine that could throw it out of step - metaphorically and literally. What I don't know is whether stepper drivers (or the G250X) do any sort of buffering. If they could, wouldn't that make the whole "realtime" thing far less of an issue for software like Mach 3 & LinuxCNC?
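
    To put rough numbers on that speculation, here's a back-of-the-envelope sketch. The 200 full steps per rev and 10 microsteps per full step are my assumptions (typical values, not measured from this machine), and the function names are invented for the example.

```python
# Back-of-the-envelope step timing.
# Assumed values (not taken from the thread): a 200 full-step/rev motor
# and a driver that expects 10 step pulses per full step.
FULL_STEPS_PER_REV = 200  # assumption: typical 1.8 degree stepper
MICROSTEPS = 10           # assumption: pulses per full step


def pulse_period_us(rpm, full_steps_per_rev=FULL_STEPS_PER_REV, microsteps=MICROSTEPS):
    """Microseconds between step pulses at a steady speed."""
    pulses_per_second = rpm / 60 * full_steps_per_rev * microsteps
    return 1_000_000 / pulses_per_second


def full_steps_missed(rpm, gap_ms, full_steps_per_rev=FULL_STEPS_PER_REV):
    """Full steps of motion that a gap of gap_ms in the pulse train would cover."""
    full_steps_per_second = rpm / 60 * full_steps_per_rev
    return full_steps_per_second * gap_ms / 1000


if __name__ == "__main__":
    rpm = 450
    print(f"pulse period at {rpm} rpm: {pulse_period_us(rpm):.1f} us")
    for gap_ms in (0.5, 2.0, 5.0):
        print(f"{gap_ms} ms gap = {full_steps_missed(rpm, gap_ms):.2f} full steps")
```

    Under those assumptions, 450rpm means a pulse roughly every 67 microseconds, and a 2 ms hiccup is worth three full steps of travel, so it wouldn't take much of a delay to land in "a step or three" territory.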



    It definitely helps - even just the act of trying to document and explain the situation (at least when the explainee understands the subject matter. I could spend hours explaining the issue to my mother, and it would help as much as explaining math to a chimpanzee using banana-based metaphors would help develop a new model of particle physics). It's also great to have the brain dump there to refer back to, when you happen to run into the same problem again a couple years later. I tend to put it off or avoid it too, though - mostly because trying to put the whole situation and troubleshooting process into words takes so much longer than "just lemme try one more thing...".

    It's one of my special talents. Back in the early or mid-90s, my first attempt at running a *nix system was a copy of the Linux predecessor MINIX, which came (binary and source) with the Operating Systems: Design and Implementation textbook. It didn't go so well. I emailed the support address in the book, and ended up in a conversation with the author/developer, who said it was the first time he'd ever heard of the problem. Apparently no one else had ever tried running it on an 8088 upgraded to 286 by way of an ISA breakout board. And I suspect no one else ever has.

    "It's a shame" seems to come up a lot in relation to Mach 4 (although it may have gotten better - I haven't kept up). It always sounded like it was more ambitious and more polished than Mach 3, but that development had moved at a snail's pace and it just wasn't a good substitute (it also couldn't even handle parallel ports without buying an extra plugin, meaning it wasn't really a drop-in replacement either). Between that and the way Mach 3 feature development seemed to dry up well before 4, with so many rough edges unaddressed (something I've seen ascribed to one of the main devs leaving), it really doesn't give me the faith to drop a couple hundred on it (plus plugin and/or hardware) when there's an alternative like LinuxCNC.

    Of course, the LCNC stock GUI is even more crippled than Mach 3's, but even aside from the other interfaces available (Mach 3 has at least three or four, a couple of which might even be useful), it looks like design/customization is going to be a hundred times more sensible.

    Someone could potentially DIY them. My idea of DIY electronics might involve an Arduino, some LEDs and a bit of dead bug soldering or a breadboard, and my coding skills (which never extended to much hardware-level stuff to begin with) have atrophied so far that it took me more than a day just to puzzle out the playfully obfuscated javascript (javascript!) behind a certain easter egg (and I even skipped over what looked like the interesting functions).

    That said, yeah, they seem to have enough industrial adoption (Tormach certainly helps) that the hardware's not likely to disappear any time soon, and all the work with LinuxCNC means that software support is practical even without the company.

    While I agree with you to some extent, I tend to have philosophical objections to platform lock-in - among other things, because the less of a (practical) ability a userbase has to vote with their feet and move to a better competitor, the more a developer is freed of the need to worry about how well they're serving those now-captive customers.

    On a more practical level, though - and specifically when it comes to the Mach 3/LinuxCNC issue - it's much simpler. I don't have any interest in throwing hundreds of dollars at hardware that's going to lock me into Mach 3 - which I've never been satisfied with - when I'm hoping to move away from it to LinuxCNC. And, conversely, I don't want to drop the money on something that'll lock me into LinuxCNC until I have it running & configured solidly enough to be confident it's what I want to use for the long haul. I expect I will be going the Mesa route (since, as you said, they do look like really nice boards) - it's just a miserable time for it, and I'm grumbling because both sides are making this transition period difficult for me.

    Are we talking about this same UI? :
    [​IMG]

    ...or do you have something different set up?

    Because while Mach 3 is pretty hard on the eyes and is known to make grown UI/UX designers cry, I guess maybe I'm just used to having a bit more in the way of information and controls available. PathPilot is a skin for LinuxCNC, but from everything I've seen it's a skin that exposes significantly more functionality than the default. Whether it's necessarily better than any of the other skins available, though, I don't know. I didn't do much research on that front, since I was trying to get "functional" sorted before investing much time in "usable" or "enjoyable".

    What were we talking about? Daily something? I think I might've heard of them once. Weren't they a plumbing supplier or golf commentary site or something?


    -Bats
    (Daily Mail? no, wait, that's what plumbing is supposed to carry away)
     
