Did you have to replace your MCU?

Hey all,

I've run into what seems to be a known issue -- around 4 years after getting my Model S, the MCU now has to be replaced due to the burned out eMMC (flash memory) - caused by high volume writes (by syslog in the Linux subsystem). And since this chip is soldered to the motherboard, Tesla will charge between $2200 and $5000 for a new MCU. I was personally charged $3200.

I'm actually surprised, after reading all the forum threads describing this particular issue, that nobody has thought about suing Tesla for sabotage. Clearly, they know where the problem is coming from and obviously, they know how to resolve it, but it seems that charging owners that have no warranty anymore (coincidence?) is a good way for them to have a steady cash flow.

Let me know if you could share your own experience with the MCU replacement on your car.

And once I gather all the documentation and hopefully, signatures, I'm going to book some time with my lawyer.


NKYTA | September 18, 2019

Sabotage? Really?

jimglas | September 18, 2019

Sue for electronics failure?
Really?

| September 18, 2019

@ruslan - There may be some truth to parts of what you describe, but as an engineer, I see a lot of loaded conjecture that you have assumed to be fact. Flash memory doesn't burn out, but it does have a finite lifetime. While MCU failures are not common, they can happen for a host of reasons, as with any electronic product. The cause may be the flash memory, the CPU, RAM, or any of hundreds of other parts in the MCU. It's sort of amazing electronics are so reliable. Unless you did an analysis of the exact internal failure before replacement, you've made a lot of assumptions that I don't see how you could prove.

If you're concerned, you could have purchased the extended warranty. For any product, whether an iPhone, a car, or a TV, once it falls out of warranty I don't see how a lawsuit would be successful, no matter how you attempt to frame it, even if the failure was caused by a poor design.

ruslan | September 18, 2019

Appreciate your expanded reply! I may have let my feelings interfere with my typing...
Being a software architect with my earlier years spent dealing with both hardware and OS, I fully agree with you that there are tons of reasons why an MCU may fail.
In my particular case, there is indeed nothing that proves the fact that the black screen is caused by the dead flash storage. But I'm ready to work with Tesla in obtaining more information on the issue.

What is bothering me is the fact that Tesla is apparently aware (fast-forward to 8:40) that with an average user, the storage _will_ get worn out after 4 years or so. There's an SD card on the motherboard for archived data (as per the video), yet the main storage used to operate the OS is a soldered memory chip. There are plenty of industrial modular flash memory options out there that Tesla could consider. However, I think for them, the cheapest option was to go with motherboards that already came with soldered storage.

Being aware of the memory wear, I think we can agree the Tesla engineers knew that at some point, the chips would lose the ability to write.

iPhones and other electronic devices that use flash storage (and SSDs) are indeed affected by the same issue, however their write cycles cannot be compared with the extensive non-stop logging on Tesla vehicles.

| September 18, 2019

@ruslan - Yep, saw the video months ago and still feel it is conjecture on the failure modes. He may be right, but there are a lot of assumptions going on. It also doesn't mean Tesla is "apparently aware" just because a hacker examined it, or that Tesla knew of the issue in 2009 when it was designed. What we don't know is the amount of data actually being written over a period of time and the design life of the part. I'm not sure Tesla even knows. They have moved on to a newer design with MCU2 (which has a larger flash memory). The larger the memory, the longer it lasts, if the same number of writes occur.
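To put the "larger memory lasts longer" point in numbers, here is a minimal Python sketch. It assumes ideal wear leveling (writes spread evenly over the whole chip) and ignores write amplification. The capacities, erase-cycle rating, and daily write volume are hypothetical values for illustration, not Tesla or chip-vendor figures.

```python
def years_to_wear_out(capacity_gb, erase_cycles, gb_written_per_day):
    """Idealized flash life: total writable volume divided by daily write volume."""
    total_writable_gb = capacity_gb * erase_cycles
    return total_writable_gb / gb_written_per_day / 365

# Hypothetical numbers for illustration only.
small = years_to_wear_out(capacity_gb=8, erase_cycles=500, gb_written_per_day=2)
large = years_to_wear_out(capacity_gb=16, erase_cycles=500, gb_written_per_day=2)
print(f"8 GB: {small:.1f} years, 16 GB: {large:.1f} years")
```

Under these assumptions, doubling the capacity doubles the estimated life for the same write volume.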

I expect there is some validity to the argument that Tesla/Nvidia could have designed it better. I'm about 90% sure the chip is part of the Visual Computing Module purchased from Nvidia as a sub-assembly to the MCU. So Tesla may have had no choice in the chip, or the fact it was soldered on the board or the amount of memory. I would agree, it would be a better design to make it easily replaceable, but then there are security risks from hackers too, so that may not be all that great of a solution either.

As for the logs, they do provide valuable information when things go wrong. It's easy to say Tesla shouldn't record some information to extend the life of the part, but then it becomes harder to analyze software issues. There are so many engineering tradeoffs, from part selection, environmental considerations, cost, and longevity. Perhaps in hindsight, they could have picked different objectives, but I can't see a lawsuit working in this case. I wish you luck but worry you'll be spending a lot more effort and costs for no gain.

rxlawdude | September 18, 2019

When/if my 2015 S70D's MCU fails, you can bet I will be demanding the defective MCU be returned.

I wonder if the replacement is a refurb.

SCCRENDO | September 18, 2019

My Model S with 163,000 miles had its MCU replaced because of a bubble in the glass rather than a failure. But after 100,000 miles, if it needs replacement, pay up.

rxlawdude | September 18, 2019

@SCC, sure, you have to pay out of warranty. But you should be able to get the part they pull, if you request it at the time of the service write up.
Unless they consider this a "core" that has to go back. But that would imply that they are installing refurbished MCUs.

| September 18, 2019

@rxlawdude - I think the right to get a removed part back on out-of-warranty work only applies in California, and perhaps a few other states. As you noted, the owner must also ask for the removed parts before the work is done.

justin.cockett | September 18, 2019

01/2016 85D

My MCU would not restart for hours (black screen) the other day. When it did come back, it was reset to factory settings, and it was another day before the LTE network came back. A subsequent software upgrade was pushed by Tesla, and another after that. The second update hung the MCU, but it recovered after a scroll wheel reset.

IMHO this has all the signs of a pending eMMC failure, unless anyone else can suggest anything? I'm in the UK.

| September 19, 2019

I had to think long and hard why the writing of log data to the eMMC flash memory made no sense to me as a cause of failure. I really had to put some numbers to it.

While I don’t know the amount of logging data Tesla has, I do have a Linux server that hosts my website. It is complex, has a ton of plugins, not all perfect (i.e. some bugs written to the logs). This averages 277 bytes/hour. I’d expect the car produces less log data, but let’s say it’s 10 times what I see on my Linux server or 2,700 bytes/hour.

A while back I had an ICE car that reported its average speed. It was 37 mph, which seemed low for both freeway and city driving. Let's assume your Tesla is worse, and only averages 30 mph.

The eMMC flash memory is 64 Gbit or 8 GB. It’s hard to know how much space is allocated to the log, but let’s assume 1/10 of the drive is for the log. Most of it will be used to store the software for the processor, likely loaded into RAM at bootup. So 8 GB/10 = 800K bytes for log storage.

It takes 800,000/2700 = 296 hours to write before the log is overwritten.

I don’t know the specs of the flash memory, but good memory should last at least 1500 writes, and cheaper memory about 500 writes. Let’s assume cheaper memory.

296 hours x 500 = 148,000 hours before failure

At an average speed of 30 mph, 30 times 148,000 hours = 4.4 million miles.

So, the writing of logs should not fail for 4.4 million miles!

Also consider if the log writing does fail, it should not affect the car in any way. The logs are not used in the operation of the car, so it really doesn’t matter if that part of the chip is damaged.

Now, there could easily be other things written frequently to the flash memory that cause it to have an early death, but I very much doubt the hacker's belief that the logs have anything to do with it. You can play with the numbers, but I'm hard-pressed to come up with a scenario where writing log data has any real effect on the MCU over the life of the car.

There is a lot here, and perhaps I made a grievous error somewhere, so I welcome others to look over my analysis in case I screwed it up :)

rxlawdude | September 19, 2019

The biggest assumption is what's being written to that flash memory. I suspect the writes are much more frequent and sizable.

S75RedRidingHood | September 19, 2019

@TeslaTap, 8 GB/10 is not 800K bytes! 8 GB/10 = 800 megabytes. Also, Tesla logging is much more than 2,700 bytes/hour. If you read one of the NTSB reports on a Tesla crash, you can see that they were able to recreate the vehicle's path second by second, which suggests that far more data is logged, every second rather than every hour.
As with any electronic device, it will fail eventually, so I think we should all be prepared for this to happen in any car. Tesla is a special case since this computer happens to be a vital part of the car and we cannot drive the car without it (well, one may try, but it would be very uncomfortable).

| September 19, 2019

@S75Red - Thanks. I'm only looking at the Linux OS log, not any vehicle information. The hacker saw the Linux logging, concluded that it was the cause of the eMMC failure, and went so far as to disable logging. This seemed like a totally wrong conclusion, and I think I came close to proving that.

I was only off by a factor of 1000. The Linux log should not fail for 4,400 million miles :) Dumb error.
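For anyone who wants to play with the numbers, here is a minimal Python sketch of the corrected estimate. Every input is one of the guesses from this thread (log rate, log partition size, erase cycles, average speed), not a measured Tesla value, and the function name is just for illustration.

```python
# All inputs are the thread's guesses, not measured Tesla values.
LOG_BYTES_PER_HOUR = 2_700           # 10x the web server's log rate
LOG_PARTITION_BYTES = 800_000_000    # 1/10 of an 8 GB chip (decimal GB)
ERASE_CYCLES = 500                   # "cheaper memory" assumption
AVG_SPEED_MPH = 30

def miles_until_log_wearout(partition_bytes, bytes_per_hour, cycles, mph):
    """Miles driven before the log area has been rewritten `cycles` times."""
    hours_per_full_pass = partition_bytes / bytes_per_hour
    return hours_per_full_pass * cycles * mph

miles = miles_until_log_wearout(
    LOG_PARTITION_BYTES, LOG_BYTES_PER_HOUR, ERASE_CYCLES, AVG_SPEED_MPH
)
print(f"{miles / 1e6:,.0f} million miles")
```

This lands in the same ballpark as the roughly 4,400 million miles quoted above. Changing any one input scales the result proportionally.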

I do expect other data is being written in higher volume and frequency, and that could be the cause of eventual eMMC flash memory failures. As to the importance of that data and how much of it is being written, that would require far more in-depth analysis, which is beyond any currently available information.
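One way to frame the open question is to turn it around: how fast would the MCU have to write to wear out the whole chip in about 4 years? A rough Python sketch follows. It assumes an 8 GB chip, the 500-cycle guess from earlier, ideal wear leveling, no write amplification, and writes happening around the clock. All of those are assumptions, so treat the result as an order-of-magnitude figure only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def sustained_write_rate(capacity_bytes, erase_cycles, years):
    """Average bytes/second needed to use up every erase cycle in `years`."""
    total_writable_bytes = capacity_bytes * erase_cycles
    return total_writable_bytes / (years * SECONDS_PER_YEAR)

# Assumed values: 8 GB chip, 500 cycles, 4-year life as reported in this thread.
rate = sustained_write_rate(capacity_bytes=8_000_000_000, erase_cycles=500, years=4)
print(f"{rate / 1000:.1f} KB/s around the clock")
```

That is tens of kilobytes per second on average under these assumptions, far above the bytes-per-hour figure estimated for the Linux log alone.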

jordanrichard | September 20, 2019

I am still on my original MCU in my March 2014 built MS. I have 165,000 miles which is the equivalent of a touch over 10 years of driving for the average driver. So operationally my MCU has been running for 10 years, no issues like those described.

shmckinley | October 1, 2019

My 2013 Model S MCU began displaying text with missing characters in various pop-up windows (most noticeably the battery charging configuration window) which left things rather unintelligible. A service center agent explained how the memory in the vehicle is storing every single trip, which over time, consumes much memory space. She had me reset Trip A as well as Trip B. "Clearing" that memory resolved the issue. All items returned to being displayed correctly. Not sure how relevant this experience is to this thread, but I thought it worth sharing in case others encounter this MCU display problem.