This document describes the preferred method for troubleshooting various faults of the L3+ hash board and how to use the test platform to diagnose issues accurately. It has been translated from the original Chinese and checked as carefully as possible for technical correctness and proper English grammar. The content of this article comes from our enthusiastic customer Bob; we thank him for his contribution to the mining community.
Section 1. Maintenance platform requirements
1. Constant-temperature soldering iron (350°-400°); the pointed tip is used for soldering small surface-mount parts such as chip resistors and capacitors.
2. The hot air station is used for SMD work. Be careful not to apply heat for an excessive time to avoid PCB blistering.
3. APW3 power supply (output 12V, 133A Max), used for the test and measurement of the hash board.
5. Flux, absolute alcohol; alcohol is used to clean the board and components both before and after solder work.
6. Tin plating fixture, tin plating steel mesh (stencil), solder paste; when fitting a replacement chip, you must tin the chip properly.
You can also choose Antminer L3+ repair parts and tool kits. The repair bundle includes all the accessories and tools needed to repair L3+, which is convenient and quick.
1. The maintenance personnel must have certain electronic knowledge, more than one year of maintenance experience, and be proficient in QFN package soldering technology.
2. After repairing, the hash board must be tested at least twice before it can be declared as repaired.
3. Follow the instructions when replacing a chip. After replacing any component, confirm that there is no obvious deformation of the PCB. Check the replaced parts and the surrounding parts for open circuits and short circuits, paying particular attention to solder balls and bridges.
4. Determine that all test equipment is configured correctly before starting the test work.
Principle and structure:
1. The L3+ hash board contains 12 voltage domains in series. Each voltage domain has 6 BM1485 chips, so the whole board has 72 BM1485 chips.
2. The BM1485 chip has a built-in step-down diode; whether the step-down diode function is used is determined by a designated pin of the chip.
3. The L3+ clock comes from a single 25MHz crystal oscillator and is passed in series from the first chip to the last chip.
4. There is an independent small heat sink on the back of each L3+ chip. It is fixed to the back of the IC with thermal glue after the initial test of the board. After a repaired or replaced chip has passed the test, evenly coat the IC surface with black thermally conductive glue and heat it to fix the heat sink in place.
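The domain layout described above (12 domains of 6 chips) can be turned into a small lookup that is handy when a test result names a chip number. This is only a sketch: it assumes chips are numbered consecutively so that chips 1-6 sit in domain 1, chips 7-12 in domain 2, and so on. The function names are made up for illustration.

```python
CHIPS_PER_DOMAIN = 6    # from the text: 6 BM1485 chips per voltage domain
DOMAIN_COUNT = 12       # from the text: 12 voltage domains in series
CHIP_COUNT = CHIPS_PER_DOMAIN * DOMAIN_COUNT  # 72 chips on the board


def domain_of_chip(chip):
    """Return the 1-based voltage domain of a 1-based chip number.

    Assumption: chips are numbered consecutively within each domain.
    """
    if not 1 <= chip <= CHIP_COUNT:
        raise ValueError("chip number must be between 1 and 72")
    return (chip - 1) // CHIPS_PER_DOMAIN + 1


def chips_in_domain(domain):
    """Return the chip numbers that share the given 1-based voltage domain."""
    if not 1 <= domain <= DOMAIN_COUNT:
        raise ValueError("domain number must be between 1 and 12")
    first = (domain - 1) * CHIPS_PER_DOMAIN + 1
    return list(range(first, first + CHIPS_PER_DOMAIN))
```

For example, under this numbering assumption a fault reported at chip 50 points to the chips returned by `chips_in_domain(domain_of_chip(50))` as the ones sharing its supply.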
● Analysis of key points:
1. The following figure illustrates the signal path of the L3+ hash board:
The CLK signal is generated by the Y1 25MHz crystal oscillator and is transmitted sequentially from chip 1 to chip 72; the voltage is 0.9V both in standby and during operation.
The TX (CI, CO) signal enters at pin 11 of the IO port and is then transmitted sequentially from chip 1 to chip 72; when the IO cable is not inserted the voltage is 0V, and during operation it is 1.8V.
The RX (RI, RO) signal returns from chip 72 to chip 1 and then goes back to the control board through pin 12 of the IO port; the voltage is 1.8V both when the IO cable is not inserted and during operation.
The BO (BI, BO) signal is passed from chip 1 to chip 72 and is held low; it is 0V when the IO cable is not inserted and in standby, and it is a pulse signal of about 0.3V during operation.
The RST signal enters at pin 15 of the IO port and is then transmitted from chip 1 to chip 72; it is 0V when the IO cable is not inserted and in standby, and 1.8V during operation.
2. The following figure shows the key circuits on the front of the L3+ hash board.
1) Test points are between each chip (as shown in the figure after zooming in): Figure 2
When diagnosing a fault in the signal chain between the hash chips, the test points are the easiest way to determine where the signal is interrupted. Their arrangement is: RST, BO, RI (RX), CO (TX), CLK.
2) Voltage domain: The board has 12 voltage domains, and each voltage domain powers 6 chips. The 6 chips within a voltage domain are powered in parallel, and the voltage domains are then connected in series with one another. The circuit structure is shown in Figure 4 on the next page.
Enlarged detail of voltage domain single chip (Figure 5 below):
Figure 6 below shows the detailed pin out of the BM1485 ASIC chip
During maintenance, the main measurement points are the test points before and after each chip (5 before and 5 after: CLK, CO, RI, BO, RST), together with the CORE voltage, LDO-1.8V, PLL-0.9V, the DC-DC output, and the 14V boost voltage.
1) When the IO cable is not plugged in and only 12V is connected: the DC-DC output is about 10V, and the boost output is about 14V. The test points must show CLK at 0.9V and RI at 1.8V; the other test points are all 0V.
2) When the IO cable is plugged in and the test button has not been pressed, neither the DC-DC nor the boost circuit has a voltage output. After the test button on the tool is pressed, the PIC starts to work, the DC-DC outputs the voltage set by the test program, and the boost circuit works along with it. The tool then sends WORK and receives nonces back after calculation. At this time, the normal voltage of each test point should be:
CO: when the tool has just sent WORK, the DC level is pulled down because of the negative polarity of CO, and the instantaneous voltage is about 1.5V.
RI: 1.6-1.8V. During calculation, if this voltage is abnormal or too low, the hash board will behave abnormally or the hash rate will be 0.
BO: 0V when there is no calculation. During calculation, there is a pulse between 0.1V and 0.3V.
RST: 1.8V. Each time you press the test button of the tool, the reset signal is output again.
When the status or voltage of the test points above is abnormal, the fault location can be determined by measuring the test points.
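As a rough illustration of the comparison described above, the sketch below checks a set of multimeter readings against the operating values quoted in this section. The ranges for RI and BO come from the text; the CO range (1.5V dip up to 1.8V) is derived from it, and the ±0.1V windows around CLK (0.9V) and RST (1.8V) are assumptions chosen only for the example. The function name and input format are hypothetical.

```python
# Expected voltages (in volts) while the tool is running a test.
# RI and BO ranges are quoted in the text; the CLK and RST windows
# are an assumed +/-0.1 V tolerance around the quoted values.
EXPECTED_RUNNING = {
    "CLK": (0.8, 1.0),
    "CO": (1.5, 1.8),
    "RI": (1.6, 1.8),
    "BO": (0.0, 0.3),
    "RST": (1.7, 1.9),
}


def abnormal_points(measured):
    """Return the names of test points whose reading is out of range.

    measured: dict mapping a test point name to the reading in volts.
    Names that are not in EXPECTED_RUNNING are ignored.
    """
    bad = []
    for name, volts in measured.items():
        if name not in EXPECTED_RUNNING:
            continue
        low, high = EXPECTED_RUNNING[name]
        if not low <= volts <= high:
            bad.append(name)
    return sorted(bad)


readings = {"CLK": 0.9, "CO": 1.7, "RI": 1.2, "BO": 0.0, "RST": 1.8}
print(abnormal_points(readings))
```

In this example only the RI reading falls outside its window, so that is the test point to follow back along the chain.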
●Signal flow, per the image above:
CLK signal: input at pin 23 of the chip and output from pin 5. Across voltage domains, pin 5 is connected to pin 23 of the next chip through a 100nF capacitor.
TX signal: input from pin 25 of the chip and output from pin 4;
RX signal: travels in the return direction, entering the chip at pin 3 and leaving from pin 26;
BO signal: input from pin 27 of the chip and output from pin 2;
RST signal: input from pin 28 of the chip, output from pin 1.
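The pin assignments and directions listed above can be kept in a small table so that the probing order for each signal is explicit. This is a sketch only: the `Signal` record, `probe_order`, and `first_abnormal` are illustrative names, the `is_ok` callback stands in for a manual measurement, and treating pin 3 as the RX entry side is an interpretation of the list above.

```python
from typing import NamedTuple


class Signal(NamedTuple):
    pin_in: int     # pin where the signal enters the chip
    pin_out: int    # pin where the signal leaves the chip
    forward: bool   # True: chip 1 -> 72, False: chip 72 -> 1


# Pin numbers as listed in the text; RX is the only reverse signal.
SIGNALS = {
    "CLK": Signal(23, 5, True),
    "TX": Signal(25, 4, True),
    "RX": Signal(3, 26, False),
    "BO": Signal(27, 2, True),
    "RST": Signal(28, 1, True),
}


def probe_order(name, chip_count=72):
    """Return chip numbers in the order the named signal travels."""
    chips = range(1, chip_count + 1)
    return list(chips) if SIGNALS[name].forward else list(reversed(chips))


def first_abnormal(name, is_ok):
    """Walk along the signal path and return the first chip that fails.

    is_ok: hypothetical callback taking a chip number and returning True
    when the measured test point at that chip looks normal.
    """
    for chip in probe_order(name):
        if not is_ok(chip):
            return chip
    return None
```

Walking every chip is the slow, exhaustive version of the search; it is shown here because it makes the direction of each signal obvious.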
Test the signal voltage of each chip, along with the CORE, LDO-1.8O, PLL-0.9V and other voltages:
CORE: 0.8V. When this voltage is abnormal, the CORE of a chip in that voltage domain is usually short-circuited.
LDO-1.8O: 1.8V. When this voltage is abnormal, the LDO-1.8O of a chip is short-circuited or open.
PLL-0.9: 0.9V. When this voltage is abnormal, the PLL-0.9V power supply of a chip in that voltage domain is short-circuited.
1) To determine the operating status of the hash board, the computing capability of the chips, and the temperature sensing, refer to the information displayed in the window of the test tool.
3. IO port: the IO connector is a 2x9-pin, 2.0 pitch, 90-degree PHSD double-row in-line header. The pin definitions are shown in Figure 8 below:
As shown in the figure:
Pins 1, 2, 9, 10, 13, and 14: GND.
Pins 3 and 4 (SDA, SCL): the I2C bus of the DC-DC PIC. It carries communication between the control board and the PIC; the control board reads and writes PIC data through it to control the operating status of the hash board.
Pin 5 (PLUG0): the identification signal of the hash board. This signal is pulled up to 3.3V through a 10K resistor on the hash board, so when the IO cable is plugged in, this pin should be at high level.
Pins 6, 7, and 8 (A2, A1, A0): PIC address signals.
Pins 11 and 12 (TXD, RXD): the 3.3V side of the hash board's data channel; they become the TX (CO) and RX (RI) signals after being divided by resistors.
The pin level of the IO port is 3.3V, and after the voltage is divided by the resistor, it becomes 1.8V.
Pin 15 (RST): It is the 3.3V end of the reset signal, which becomes a 1.8V RST reset signal after being divided by resistors.
Pin 16 (D3V3): Provides 3.3V power supply for the hash board. The 3.3V is provided by the control board, mainly to provide the working voltage for the PIC.
TX_IN voltage is 1.8V
RST_IN voltage is 1.8V
14V boost circuit:
Responsible for boosting the DC-DC output (10-10.4V) to 14V. The principle is to boost 10V to 14V through the U111 RT8537 switching power supply: U111 generates the switching signal, L1 is the energy storage inductor, and D100 is the boost rectifier diode, which charges and discharges C1072 so that 14V is obtained at the positive pole of C1072. As shown in Figure 11 and Figure 12:
Note: An excessive increase of the voltage of the boost circuit can easily cause damage to the LDO of the last four voltage domains of the hash board, and also easily cause damage to the chip.
Most abnormalities are caused by oxidation of U111, R996, and R997. The output voltage test points of the boost circuit are the two ends of C1072.
4.DC-PIC: Consists of PIC16(L)F1704. As shown in Figure 13 and Figure 14, this device stores the frequency information and voltage values of the hash board chips.
It also controls the DC-DC output voltage for the hash board.
5. DC-DC circuit: composed of the LM27402SQ and TPHR9003NL MOSFETs. As shown in Figure 15 and Figure 16 below:
The DC-DC output voltage test points are the two ends of capacitor C948.
When the DC-DC voltage is abnormal, first use the tool's printed information to check whether the voltage value stored in the PIC matches the DC-DC output voltage; if it does not, first replace the small capacitors around the LM27402SQ. If the DC-DC has no output, check whether the EN voltage at R13 and R14 is about 1V, whether the voltage at R11 is 12V, whether the PIC is working abnormally, and whether the PIC can receive commands from the control board normally.
7. The 1.8V LDO is an SPX5205M5_L_1_8. As shown below:
SPX5205M5 pins 1 and 3 are input, and pin 5 is 1.8V output;
The PLL-0.9V voltage is obtained by dividing the voltage of VDD-1.8 through two resistors.
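The resistor dividers mentioned in this guide (1.8V down to the 0.9V PLL supply, and the 3.3V IO levels down to 1.8V) follow the usual unloaded divider relation Vout = Vin × Rbottom / (Rtop + Rbottom). The sketch below only illustrates that arithmetic; the resistor values are hypothetical and are not taken from the L3+ schematic.

```python
def divider_output(v_in, r_top, r_bottom):
    """Unloaded output voltage of a two-resistor divider.

    v_in is applied across r_top + r_bottom; the output is taken
    across r_bottom. Resistances are in ohms, voltages in volts.
    """
    return v_in * r_bottom / (r_top + r_bottom)


# Two equal resistors halve the 1.8 V rail (hypothetical 10k / 10k).
pll_v = divider_output(1.8, 10_000, 10_000)

# A 10k / 12k pair brings a 3.3 V IO level down to about 1.8 V
# (again hypothetical values, chosen only to show the ratio).
io_v = divider_output(3.3, 10_000, 12_000)

print(round(pll_v, 3), round(io_v, 3))
```

When a divided voltage measures wrong on the board, the same relation can be used in reverse: a reading well above the expected value suggests the bottom resistor is open or high, and a reading near zero suggests the top resistor is open or the node is shorted.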
8. Temperature sensing circuit:
Composed of a sensor IC. The temperature sensor chip reads the built-in temperature sensor of the BM1485 through pins 6 and 7 of the BM1485 and collects the temperature parameters. The data then passes through pins 15 and 16 of the BM1485 and returns to the FPGA of the control board via RI. The principle is shown in Figure 21:
General Repair Flow.
1. Routine inspection: First, visually inspect the hash board to be repaired and check whether any small heat sink is displaced, deformed, or scorched. If so, it must be dealt with first; if a small heat sink is displaced, remove it, wash off the original glue, and glue it back on after the board has passed the repair test.
If visual inspection shows no problem, the impedance of each voltage domain can be tested first to detect whether there is a short circuit or an open circuit. If found, it must be dealt with first. Check whether the voltage of each voltage domain reaches 0.8V, and the voltage difference of each voltage domain shall not exceed 0.05V. If the voltage of a specific voltage domain is too high or too low, the circuits in the adjacent voltage domain are generally abnormal. We need to investigate the reason first.
2. After the routine inspection shows no problem (the short-circuit check is essential, so that a short circuit does not burn a chip or other parts when the power is turned on), the test fixture can be used to test the chips, and the fault can be judged and located from the results of the test fixture.
3. According to the test result displayed by the test box, starting near the faulty chip, check the chip test points (CLK IN OUT/TX IN OUT/RX IN OUT/B IN OUT/RST IN-OUT) and VDD, VDD0V8, VDD1V8, and other voltages.
4. According to the signal flow, the RX signal is transmitted in the reverse direction (chip 72 to chip 1), while the CLK, CO, BO, and RST signals are transmitted in the forward direction (chip 1 to chip 72); the abnormal point can be found by checking the test points in sequence.
5. Once the faulty chip is located, it needs to be re-soldered. Apply flux around the chip (preferably no-clean flux), heat the solder joints of the chip pins until the solder melts, and then gently nudge and press the chip so that the pins and pads re-seat and the solder wets the joints again. If the fault remains after re-soldering, replace the chip directly.
6. A repaired hash board must pass the test fixture at least twice. The first test is done after the replacement of parts is completed and the hash board has cooled down; after it passes, set the board aside. For the second test, wait a few more minutes for the hash board to cool before testing again. Although the two tests are a few minutes apart, this does not slow down the work: set the repaired board aside, continue repairing the second board, set that one aside to cool when it is repaired, and then test the first board again. In this way the waiting times are staggered and the total time is not extended.
7. For each repaired board, classify the fault and record the type, location, and reason for each replaced component, so that the information can be fed back to production, after-sales, and R&D.
8. After recording, install it into a complete miner for formal aging.
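The voltage-domain check in step 1 of the flow above can be sketched as a small helper. It assumes the 12 domain voltages have been measured and typed in by hand, and it reads the 0.05V rule in two ways at once: no domain should be more than 0.05V from the nominal 0.8V, and the highest and lowest readings should not differ by more than 0.05V. That double reading is an interpretation, and the function name is made up for the example.

```python
NOMINAL_V = 0.8      # expected voltage of each domain, from the text
MAX_DIFF_V = 0.05    # maximum allowed difference, from the text


def check_domains(voltages):
    """Check 12 per-domain voltage readings (domain 1 first).

    Returns (spread_ok, suspects):
      spread_ok - True when max and min readings differ by <= 0.05 V
      suspects  - 1-based domains more than 0.05 V away from 0.8 V
    """
    if len(voltages) != 12:
        raise ValueError("expected one reading per voltage domain (12)")
    spread_ok = (max(voltages) - min(voltages)) <= MAX_DIFF_V
    suspects = [
        index + 1
        for index, volts in enumerate(voltages)
        if abs(volts - NOMINAL_V) > MAX_DIFF_V
    ]
    return spread_ok, suspects


readings = [0.80] * 12
readings[4] = 0.95   # domain 5 reads high
readings[5] = 0.66   # domain 6 reads low
print(check_domains(readings))
```

A high reading next to a low one, as in this made-up data, matches the note in step 1 that an abnormal domain voltage usually points to a fault in the adjacent domain.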
Failure types: common fault types detected by the test fixture are:
1. Imbalanced impedance among multiple voltage domains
When the impedance of some voltage domains deviates from the normal value, it indicates that parts in the abnormal voltage domain are open or short-circuited. The cause is most likely a chip. There are six chips in each voltage domain, and often only one of them is at fault. To find the problem chip, measure the impedance to ground at the test points of each chip and compare the values to find the anomalous point.
If you encounter a short circuit, first remove the heat sinks from the chips in the same voltage domain, and then check whether any chip pins are bridged by solder. If the short-circuit point cannot be found by visual inspection, it can be located with the resistance method or the current interception method.
2. Lack of chips
Lack of chips means that the test box cannot detect all 72 chips during the test; it often detects only part of the actual number of chips, and the missing (undetectable) abnormal chip is not at the displayed position. In this case the abnormal chip must be located accurately through testing.
To locate it, cut off the TX signal at a chosen chip: connect the TX output of that chip to ground, for example ground the TX output of the 50th chip to the ground of its voltage domain. In theory, if all the preceding chips are normal, the test box should then detect 50 chips. If 50 chips are not detected, the abnormality is at or before the 50th chip; if 50 chips are detected, the abnormal chip is after the 50th chip. Repeat this, halving the range each time (binary search), to find the location of the abnormal chip.
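The halving procedure above is an ordinary binary search. The sketch below shows the bookkeeping, assuming exactly one abnormal chip and a hypothetical callback `detects_all_up_to(k)` that stands for the manual step: ground the TX output of chip k, run the test box, and report whether exactly k chips were detected.

```python
def locate_abnormal_chip(detects_all_up_to, chip_count=72):
    """Binary-search for the abnormal chip on the TX chain.

    detects_all_up_to(k): hypothetical callback. The technician grounds
    the TX output of chip k, runs the test box, and returns True when
    the box detects k chips (chips 1..k are all normal).

    Assumes exactly one abnormal chip exists on the board.
    """
    low, high = 1, chip_count
    while low < high:
        mid = (low + high) // 2
        if detects_all_up_to(mid):
            low = mid + 1      # chips 1..mid are fine, look further on
        else:
            high = mid         # the fault is at or before chip mid
    return low


# Simulated board where chip 37 is the abnormal one.
faulty = 37
print(locate_abnormal_chip(lambda k: k < faulty))
```

With 72 chips the loop needs at most 7 test runs, compared with dozens if every chip were checked in order.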
3. Broken link
A broken link is similar to a lack of chips, but it does not mean that every chip that cannot be found is abnormal; rather, because one chip is abnormal, all chips behind it fail to report. For example, a chip may itself still work but not forward information from the other chips; the signal chain then stops abruptly at that point and a large part of it is lost. This is a broken link.
A broken link is generally shown by the test box. For example, the test box may detect only 14 chips. If the preset number of chips is not detected, the test box will not run the test and only shows how many chips were detected. In this case, use the displayed number “14” and check the voltage and impedance of each test point before and after the 14th chip to find the problem.
4. Not running
Not running means that the test box cannot detect the chip information of the hash board, but displays NO hash board; this phenomenon is the most common and involves a wide range of faults as in the list below:
1) Non-operation caused by abnormal voltage in a certain voltage domain; the problem can be found by measuring the voltage of each voltage domain.
2) Non-operation caused by an abnormal chip; it can be found by measuring the signal at each test point.
CLK signal: 0.9V; the signal is passed from chip 1 to chip 72, and the current version has only one crystal oscillator, so once the CLK signal is abnormal at one point, it is abnormal at all subsequent chips. Search in order along the signal transmission direction.
TX signal: 1.8V; this signal is passed from chip 01 through to chip 72. Use binary search; when a measured point is abnormal, continue checking toward the front of the chain.
RX signal: 1.8V; this signal is returned from chip 72 through to chip 1, and the cause of the fault is confirmed by following the signal direction. As with the S7 and S9 hash boards, when the board is not running this signal has the highest priority and should be checked first.
BO signal: 0V; this signal can be pulled to a high level when the chip detects that the RI return signal is normal, otherwise it is low.
RST signal: 1.8V; after the hash board is powered on and the IO cable is plugged in, this signal is passed from chip 01, 02, and so on through to chip 72.
3) Caused by an abnormal VDD of a certain chip.
Measure whether the potential difference across each voltage domain is normal. Under normal circumstances, when the VDD voltage of one domain is 0.8V, the voltage measured at the test points of the other voltage domains is also 0.8V.
4) The VDD1.8 voltage of a certain chip is abnormal.
Determine whether a certain VDD1.8 voltage is normal by measuring the test points of each voltage. In general, the IO voltage determines the voltage of each test point. When the IO voltage is 1.8V, the test points of other voltage domains are normal.
5) Caused by abnormalities in the step-down circuit and the step-up circuit.
You can directly measure whether the voltage across capacitor C948 (the DC-DC output, at the upper left corner of the hash board) is between 10V and 10.4V. If it is below or above this range, the U3 PIC needs to be reprogrammed. After confirming that the PIC voltage is normal, measure whether the U111 boost circuit outputs 14V; if not, check the peripheral parts and U111 itself.
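The order of checks in item 5) can be written down as a short decision function. This is a sketch of the reasoning only: the 10-10.4V window is from the text, while the ±0.5V tolerance around 14V is an assumption made for the example, and the function name and returned messages are hypothetical.

```python
def diagnose_power(dcdc_volts, boost_volts):
    """Suggest the next step from two readings.

    dcdc_volts  - voltage measured across C948 (DC-DC output)
    boost_volts - voltage measured across C1072 (boost output)
    """
    # The DC-DC output is checked first, because the boost stage
    # is fed from it.
    if not 10.0 <= dcdc_volts <= 10.4:
        return "reprogram the U3 PIC, then measure C948 again"
    # Assumed tolerance of +/-0.5 V around the nominal 14 V.
    if abs(boost_volts - 14.0) > 0.5:
        return "check U111 and its peripheral parts"
    return "step-down and boost outputs look normal"


print(diagnose_power(10.2, 11.8))
```

Checking the DC-DC stage before the boost stage avoids chasing a boost fault that is really caused by a wrong input voltage.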
5. Low hash rate
Low hash rate can be divided into:
1) During the fixture test, the box does not receive enough nonces, so the hash rate is insufficient and the result is shown as NG. This can be checked directly in the serial port print information of the test fixture, which shows the number of nonces returned by each chip. Generally, any chip whose returned nonce count is lower than the set value should be investigated. If cold solder joints and external causes are ruled out, the chip can be replaced directly.
2) The board passes on the test fixture, but the hash rate is low after it is installed in the complete miner. This is mostly related to the heat dissipation conditions of the chips; pay special attention to the glue used for the small heat sink of each chip and to the overall ventilation of the miner. Another cause is that the voltage of a certain chip is marginal: once the whole miner is assembled, the difference between its 12V supply and the supply used during testing produces a deviation between the tested hash rate and the running hash rate. In that case, test the board with the test box while slightly lowering the 12V output of an adjustable DC power supply, and find the voltage domain that returns the fewest nonces.
6. A certain chip is NG
It means that during the fixture test, the serial port information of the test fixture shows that the nonces returned by a certain chip are insufficient or zero. Once cold solder joints and peripheral component problems have been ruled out, the chip can be replaced directly.
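The nonce-based checks in items 5 and 6 come down to two simple questions: which chips returned fewer nonces than the set value, and which voltage domain returned the fewest in total. The sketch below assumes the per-chip counts have already been copied out of the serial log into a list (chip 1 first) and that chips are numbered consecutively within each 6-chip domain; parsing the log itself is not covered, and the function name is made up.

```python
CHIPS_PER_DOMAIN = 6
DOMAIN_COUNT = 12


def analyze_nonces(nonces, minimum):
    """Summarise per-chip nonce counts from one fixture test.

    nonces  - list of 72 counts, chip 1 first (assumed input format)
    minimum - the set value below which a chip is treated as NG

    Returns (ng_chips, weakest_domain), both 1-based.
    """
    if len(nonces) != CHIPS_PER_DOMAIN * DOMAIN_COUNT:
        raise ValueError("expected 72 nonce counts")
    ng_chips = [i + 1 for i, count in enumerate(nonces) if count < minimum]
    domain_totals = [
        sum(nonces[d * CHIPS_PER_DOMAIN:(d + 1) * CHIPS_PER_DOMAIN])
        for d in range(DOMAIN_COUNT)
    ]
    weakest_domain = domain_totals.index(min(domain_totals)) + 1
    return ng_chips, weakest_domain


counts = [100] * 72
counts[9] = 0     # chip 10 returns nothing
counts[40] = 60   # chip 41 is below the set value
print(analyze_nonces(counts, minimum=80))
```

The NG list gives the chips to inspect for solder and peripheral faults first; the weakest domain is the one to watch when the 12V supply is adjusted.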
1. During maintenance, the maintenance personnel must be familiar with each test point's function and flow direction, the typical voltage value, and the ground impedance value.
2. You must be familiar with chip soldering to avoid blistering and deformation of the PCB or damage to the pins.
3. The BM1485 chip package has 14 pins on each of two sides. The polarity and position must be aligned during soldering, with no misalignment.
4. When replacing a chip, the old thermally conductive adhesive must be cleaned off completely; otherwise the IC may sit raised off the board or dissipate heat poorly during soldering, resulting in secondary damage to the chip.