I. Maintenance Platform / Tool / Equipment Preparation Requirements
1. Platform requirements:
2. Equipment requirements:
A constant-temperature soldering iron (350℃ - 380℃) with a pointed tip is used for soldering small SMD parts such as chip resistors and capacitors; a portable desoldering gun and a BGA rework station are used for removing and soldering chips/BGAs; a multimeter whose probes are fitted with soldered steel pins and heat-shrink tubing makes measurement easier (Fluke 15B+ is recommended); an oscilloscope (FNIRSI recommended); a network cable (requirement: a stable Internet connection).
3. Requirements for test tools:
APW12 power supply (APW12_12V-15V_V1.2) and a power adapter cable for the hash board (self-made: use thick copper wires to connect the positive and negative poles of the power supply to the hash board; 4AWG copper wire with a length of 60cm or less is recommended).
Use the test fixture of the V2.2010 control board. Discharge resistors must be installed across the positive and negative poles of the test fixture; a cement resistor of 25 ohms rated at more than 100W is recommended.
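As a rough sanity check on the recommended discharge resistor, the steady-state dissipation can be estimated with P = V²/R. The sketch below is illustrative only: the 12V-15V range is assumed from the APW12_12V-15V model name, and the function and constant names are hypothetical.

```python
def resistor_power_w(voltage_v: float, resistance_ohm: float) -> float:
    """Steady-state power dissipated by a resistor across a DC supply (P = V^2 / R)."""
    return voltage_v ** 2 / resistance_ohm


R_DISCHARGE_OHM = 25.0          # recommended resistance from the text above
RATED_POWER_W = 100.0           # recommended minimum power rating from the text above
SUPPLY_RANGE_V = (12.0, 15.0)   # assumption: taken from the APW12_12V-15V model name

for volts in SUPPLY_RANGE_V:
    power = resistor_power_w(volts, R_DISCHARGE_OHM)
    print(f"{volts:.0f} V across {R_DISCHARGE_OHM:.0f} ohm dissipates {power:.2f} W")
```

Under these assumptions the dissipation stays far below the 100W rating, which suggests the recommendation leaves a wide thermal margin.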
4. Maintenance auxiliary materials / tool requirements:
M705 solder paste, flux, and circuit board cleaner with absolute alcohol; the board cleaner is used to clean off the solder residue after repair; thermal gel is applied to the surface of the chip after repair;
Reballing steel stencil, desoldering wick, and solder balls (a ball diameter of 0.4mm is recommended); when fitting a new chip, tin the chip pins first and then solder the chip to the hash board; apply thermal gel evenly on the chip surface before fastening the large heat sink.
1) Barcode scanning gun
2) Serial port adapter board (RS232 to TTL, 3.3V)
3) Self-made short-circuit probe (solder it from pin wire and cover it with heat-shrink tubing to prevent a short circuit between the probe and the small heat sink)
5. Common maintenance spare material requirements:
0402 resistors (0R, 51R, 10K, 4.7K); 0402 capacitors (0.1uF, 1uF)
II. Maintenance Requirements
1. Follow the correct procedure when replacing a chip. After any part is replaced, the PCB must show no obvious deformation. Check the replaced part and the surrounding parts for open circuits and short circuits.
2. Maintenance operators must have a solid grounding in electronics, more than one year of maintenance experience, and proficiency in BGA/QFN/LGA package soldering.
3. After repair, the hash board must pass the test at least twice before it is accepted.
4. Check that the tools and the hash board tester work normally, and confirm the test software parameters, test fixture version, etc. for the maintenance station.
5. After a chip is repaired or replaced, run the chip test first, and run the function test only after the chip test passes.
The function test requires that the small heat sinks are soldered properly. When the large heat sink is installed, the chip surface must be evenly coated with thermal gel, and the cooling fan must run at full speed. When the chassis is used for heat dissipation, 2 hash boards should be placed in it at the same time to form an air duct.
6. When measuring signals, use 4 fans to assist heat dissipation, and keep the fans at full speed.
7. When powering on the hash board, first connect the negative copper wire of the power supply, then the positive copper wire, and finally plug in the signal cable. When disconnecting, reverse the order: first remove the signal cable, then the positive copper wire, and finally the negative copper wire. Not following this order can easily damage R89, R90, U2, and U4 (with the result that not all chips can be found). Also, the repaired hash board must be allowed to cool down before the Pattern test; otherwise the test may report PNG (Pattern NG).
8. When fitting a new chip, print solder paste onto the chip pins so that the chip is pre-tinned before it is soldered to the PCBA.
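The connection order in item 7 can be written down as a small checklist helper. This is a minimal sketch for a bench checklist; the step labels and function names are hypothetical and not part of any tester software.

```python
# Power-on order from item 7: negative wire, then positive wire, then signal cable.
POWER_ON_SEQUENCE = [
    "negative copper wire",
    "positive copper wire",
    "signal cable",
]


def power_off_sequence(power_on=POWER_ON_SEQUENCE):
    """Disconnection is the exact reverse of the connection order."""
    return list(reversed(power_on))


def is_valid_power_on(observed_steps):
    """Return True only if the observed steps match the required order exactly."""
    return list(observed_steps) == POWER_ON_SEQUENCE


print("Power on: ", " -> ".join(POWER_ON_SEQUENCE))
print("Power off:", " -> ".join(power_off_sequence()))
```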
III. Hash Board Tester Making and Matters Needing Attention
The supporting fixture of the hash board tester should satisfy the heat dissipation of the hash board and facilitate the measurement of signals.
1. On first use, update the FPGA of the hash board tester control board with the 19 series hash board tester SD card flashing program: unzip the package and copy it to the SD card, then insert the card into the card slot of the hash board tester; power on for about 1 minute and wait for the control board indicator to double flash 3 times, which means the update is complete (if the update is skipped, a chip may be reported as bad during the test);
2. Make the test SD card as required: for the single-sided heat sink chip inspection, directly unzip the compressed package onto the SD card;
3. For the double-sided heat sink 8-times Pattern test, make the SD card as follows (as shown in the figure below):
1) After unzipping, delete the original config file;
2) Rename the Config.ini-NBS1902-PT2 file to Config.ini;
Figure 3-3 Figure 3-4
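The two config steps above can also be scripted when many SD cards have to be prepared. The following is a minimal sketch, assuming that the original config file is named Config.ini and that both files sit in the root of the unzipped package; the function name and the example path are hypothetical.

```python
from pathlib import Path


def prepare_pt2_config(sd_root,
                       source_name="Config.ini-NBS1902-PT2",
                       target_name="Config.ini"):
    """Delete the original config file, then rename the PT2 config to Config.ini."""
    sd_root = Path(sd_root)
    source = sd_root / source_name
    target = sd_root / target_name
    if not source.is_file():
        # Refuse to delete anything if the PT2 config is missing.
        raise FileNotFoundError(f"{source} not found; nothing was changed")
    if target.exists():
        target.unlink()        # step 1): delete the original config file
    source.rename(target)      # step 2): rename the PT2 config to Config.ini
    return target


# Example call (hypothetical mount point of the SD card):
# prepare_pt2_config("/media/sdcard")
```

Checking for the PT2 file before deleting anything keeps the card usable if the wrong package was unzipped.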
IV. Principle Overview
1. Working structure of S19+ hash board:
The hash board consists of 80 BM1398AC chips divided into 10 voltage domains of 8 ICs each (as shown in Figure 4-1 and Figure 4-2). The operating voltage of the BM1398AC chip used on the S19+ hash board is 1.34V-1.4V. The tenth domain has two groups of LDOs, for 1.8V and 0.8V: the 1.8V LDO is powered by the 14V from VDD_14V, and the 1.8V output of this domain feeds the LDO that provides 0.8V. In the ninth to the first domain, the 1.8V LDO is powered by the VDD of the following domain. Domains 1-9 each have three groups of 0.8V LDOs, which are also powered by the VDD of the next domain: 0.8V-1 supplies the first and second ASICs of a domain, 0.8V-2 supplies the third to fifth ASICs, and 0.8V-3 supplies the sixth to eighth ASICs. The voltage drops by 1.38V from one domain to the next (as shown in Figure 4-3 and Figure 4-4).
Figure 4-1
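To make the domain layout easier to apply during repair, the mapping from a chip number to its domain and 0.8V LDO group can be sketched as follows. This is only an illustration: it assumes that chips are numbered from 1 and that chips 1-8 form domain 1, chips 9-16 form domain 2, and so on, which the text above does not state explicitly. The function name is hypothetical.

```python
CHIPS_PER_BOARD = 80
CHIPS_PER_DOMAIN = 8
DOMAIN_COUNT = CHIPS_PER_BOARD // CHIPS_PER_DOMAIN  # 10 domains


def locate_chip(chip_number):
    """Return (domain, position_in_domain, ldo_group) for a 1-based chip number.

    Assumption: consecutive chips fill domains in order, starting with domain 1.
    """
    if not 1 <= chip_number <= CHIPS_PER_BOARD:
        raise ValueError(f"chip number must be 1..{CHIPS_PER_BOARD}")
    domain = (chip_number - 1) // CHIPS_PER_DOMAIN + 1
    position = (chip_number - 1) % CHIPS_PER_DOMAIN + 1
    if domain == DOMAIN_COUNT:
        # The three-group 0.8V split is only described for domains 1-9.
        ldo_group = None
    elif position <= 2:
        ldo_group = "0.8V-1"   # first and second ASIC
    elif position <= 5:
        ldo_group = "0.8V-2"   # third to fifth ASIC
    else:
        ldo_group = "0.8V-3"   # sixth to eighth ASIC
    return domain, position, ldo_group


print(locate_chip(13))
```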
2. S19+ test point and Temperature IC:
Test points on top side:
Test points on bottom side:
Temperature IC on bottom side:
3. Signal flow of S19+ chip:
1) CLK (XIN) signal: generated by the Y1 25M crystal oscillator and transmitted from chip 01 to chip 80; the voltage is 0.7V-1.3V;
2) TX (CI, CO) signal: enters from pin 7 of the IO port (3.3V), passes through level-conversion IC U4, and is then transmitted from chip 01 to chip 80; the voltage is 0V when the IO signal is not inserted and 1.8V during operation;
3) RX (RI, RO) signal: travels from chip 80 to chip 01, returns to pin 8 of the signal cable terminal through U2, and then returns to the control board; the voltage is 0.3V when the IO signal is not inserted and 1.8V during computing;
4) BO (BI, BO) signal: transmitted from chip 01 to chip 80; the value measured with a Fluke 15B+ multimeter is 0V;
5) RST signal: enters from pin 3 of the IO port and is transmitted from chip 01 to chip 80; the voltage is 0V when no IO signal is inserted and the equipment is on standby, and 1.8V during computing;
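The expected voltages listed above can be collected into a lookup table for quick comparison against multimeter readings. The sketch below is a hypothetical helper, not part of the tester software; the 0.1V tolerance is an assumption, and "idle" stands for the state in which the IO signal is not inserted.

```python
TOLERANCE_V = 0.1  # assumption: acceptable measurement deviation, not from the manual

# (signal, state) -> (min_v, max_v); "any" is used where the text gives no state.
EXPECTED_V = {
    ("CLK", "any"): (0.7, 1.3),
    ("TX", "idle"): (0.0, 0.0),
    ("TX", "operating"): (1.8, 1.8),
    ("RX", "idle"): (0.3, 0.3),
    ("RX", "operating"): (1.8, 1.8),
    ("BO", "any"): (0.0, 0.0),
    ("RST", "idle"): (0.0, 0.0),
    ("RST", "operating"): (1.8, 1.8),
}


def in_expected_range(signal, state, measured_v, tolerance_v=TOLERANCE_V):
    """Return True if a reading matches the expected voltage for this signal and state."""
    key = (signal, state) if (signal, state) in EXPECTED_V else (signal, "any")
    low, high = EXPECTED_V[key]
    return low - tolerance_v <= measured_v <= high + tolerance_v


print(in_expected_range("RX", "idle", 0.31))
print(in_expected_range("RST", "operating", 0.2))
```

A reading that fails this check is only a hint; comparing it with the same test point on an adjacent chip remains the more reliable judgment.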
4. Whole miner architecture:
The whole miner is mainly composed of 3 hash boards, 1 control board, APW12 power supply, and 4 cooling fans, as shown in Figure 4-5.
V. Common Faults and Troubleshooting of the Hash Board
Phenomenon 1: Single hash board test detects 0 chips (PT1 / PT2 stations)
First step: check the VDD_14V output and the domain voltages
Each voltage domain should measure about 1.34V-1.4V. If the 14V supply is present, the domain voltages are generally present too. If VDD_14V is missing, check the PSU output; if the PSU output is normal, check the PIC circuit (following the steps below). If 14V is present but there is no domain voltage, continue with the checks that follow.
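The branching in this first step can be summarized as a small decision helper. It is a sketch under assumptions: the domain voltages are passed in measurement order starting with domain 1, the 1.34V-1.4V range is applied as a strict limit even though the text says "about", and all names are hypothetical.

```python
DOMAIN_V_MIN = 1.34
DOMAIN_V_MAX = 1.40


def first_step(vdd_14v_present, psu_output_ok, domain_voltages):
    """Return (next_action, out_of_range_domains) for the first troubleshooting step."""
    if not vdd_14v_present:
        if not psu_output_ok:
            return "check PSU output", []
        return "check PIC circuit", []
    # 14V is present: list the domains whose voltage is outside the expected range.
    bad_domains = [
        index
        for index, volts in enumerate(domain_voltages, start=1)
        if not DOMAIN_V_MIN <= volts <= DOMAIN_V_MAX
    ]
    if bad_domains:
        return "continue checking domain voltage", bad_domains
    return "domain voltages normal", []


# Hypothetical readings: nine healthy domains and one dead domain.
print(first_step(True, True, [1.38] * 9 + [0.0]))
```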
To check the PIC circuit:
Measure whether pin 11 of U4 outputs about 3.3V. If it does, continue troubleshooting; if there is no 3.3V, check that the cable between the hash board tester and the hash board is connected properly, and reprogram the PIC.
Figure 5-2 Figure 5-3
PIC programming steps:
1. Programming the PIC on the hash board.
Programming tool: PICkit3; pin 1 of the PICkit3 cable corresponds to pin 1 of J2 on the PCB, and pins 1, 2, 3, 4, 5, and 6 must all be connected.
2. Programming software:
Open MPLAB IPE and select the device PIC16F1704, click Power to select the power supply mode, and then click Operate. Step one: click File and select the .HEX file to be programmed; step two: click Connect and confirm that the connection succeeds; step three: click Program. When programming finishes, click Verify; a successful verification message confirms that the programming succeeded.
Second step: check the boost circuit output
14V should be measurable at D1, as shown in Figure 5-7.
Third step: check the output of each group of LDO 1.8V or PLL 0.8V
Fourth step: check the chip signal output (CLK / CI / RI / BO / RST)
1. Refer to the voltage ranges given in the signal flow description. If a measured voltage deviates significantly, compare it with the value measured on the adjacent group to judge whether it is abnormal.
PS: If the hash board is not powered on or off in the specified sequence, R89, R90, U2, and U4 may burn out, and the test will report 0 chips;
2. When EEPROM NG is displayed on the LCD screen of the hash board tester, check whether U10 is soldered properly;
3. If PIC sensor NG is displayed on the LCD screen of the hash board tester and the temperature read during the test is abnormal, troubleshoot as follows:
1) Check the 4 resistors in R71~R77 for soldering defects, and check whether pins 2 and 3 of U4 are soldered properly;
2) Check the four temperature sensors and their matching resistors for soldering defects: U7 with R78, R80, R81; U8 with R83, R84, R88; U9 with R92, R94, R95; U11 with R96~R98. The sensor locations are shown in Figure 5-8; the temperature sensors are all on the back of the PCB, and the resistors are on both the front and the back of the PCB. Also check whether the 3.3V supply of the temperature sensors is normal. Check the soldering quality of the temperature-sensing chip and the small heat sink. Deformation of the large heat sink will cause poor heat dissipation of the chip and affect the temperature difference.
Phenomenon 2: Single hash board detection chip is not complete (PT1 / PT2 stations)
1. LCD display ASICNG: (0): first confirm that the total domain voltage and the 14V from the boost circuit are normal, then use the short-circuit probe to short the RO test point to the 1V8 test point between the first and second chips, and run the chip-finding program. Check the serial port log; if 0 chips are still found, one of the following applies:
1) Use a multimeter to measure whether the 1V8 and 0V8 test points read 1.8V and 0.8V. If not, the 1.8V or 0.8V LDO circuit of this domain is abnormal, or the two ASIC chips of this domain are not soldered well; most such cases are caused by short-circuited 0.8V or 1.8V SMD filter capacitors (measure the resistance of the related filter capacitors on the front and back of the PCBA).
2) Check the circuits of U2 and U4 for abnormalities such as poorly soldered resistors.
3) Measure R89 and R90 with a Fluke 15B+ multimeter: the reading should be within 10 ohms and should not jump randomly. If not, replace these two resistors.
4) Check whether the pins of the first chip are poorly soldered (during repair it has been found that pins can look tinned when viewed from the side yet carry no tin at all once the chip is removed).
2. If one chip is found in step 1, the first chip and the circuit in front of it are good. Check the subsequent chips in the same way. For example, short the 1V8 and RO test points between the 38th and 39th chips. If the log finds 38 chips, the first 38 chips have no problem; if 0 chips are still found, check 1V8 first; if 1V8 is normal, the problem lies among the first 38 chips. Continue with the dichotomy (binary search) method until the faulty chip is found. For example, if the Nth chip is faulty, shorting 1V8 and RO between the (N-1)th and Nth chips finds N-1 chips, but shorting 1V8 and RO between the Nth and (N+1)th chips finds no chips at all.
3. LCD display ASIC75: (reporting 75) means that the hash board detects 76 chips at the 115200 baud rate but only 75 chips at the 12M baud rate; one chip cannot be found at the 12M baud rate;
Repair method: use the dichotomy method. With the short-circuit probe, short the 1V8 test point and the RO test point between the 38th and 39th chips. If the log finds 38 chips, the first 38 chips have no problem. If the short is placed after the 47th chip but the log reports 46 chips, the 47th chip cannot be detected; when visual inspection shows no problem, the 47th chip is generally replaced;
4. LCD display ASICNG: (X), where X is a fixed chip number. There are two cases:
1) First case: the test time is basically the same as for a good board, and X usually does not change from test to test (test time means the time from pressing the start test button until ASICNG: (X) appears on the LCD). This is most likely caused by poor soldering of the CLK, CI, and BO resistors before and after the Xth chip, so focus on these 6 resistors.
Less commonly, the cause is poor soldering of the following pins on one of the three chips X-1, X, and X+1:
2) Second case: the test time is almost twice that of a good board (sometimes X changes from test to test, and sometimes X=0). The log then usually contains the following information (the number shown in red is not necessarily 13; it depends on which port of the hash board tester the board is connected to). During the test, the domain voltages of the domains before the abnormal position are almost all below 0.3V, and those of the domains after it are almost all above 0.38V. This is caused by a poorly soldered chip; usually 1.8V, 0.8V, RXT, and CLK are not soldered well. It is recommended to measure the domain voltages directly to locate the faulty domain. The 1V8 and RO short-circuit method used in item 1 can also locate the abnormal position;
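The dichotomy used in the short-circuit method is an ordinary binary search. The sketch below illustrates it under stated assumptions: exactly one faulty chip exists, and a hypothetical callback `chips_found(k)` stands for the manual action of shorting 1V8 and RO between chip k and chip k+1 and reading the chip count from the serial log. Any count other than k is treated as "fault at or before chip k".

```python
from typing import Callable


def find_faulty_chip(total_chips: int, chips_found: Callable[[int], int]) -> int:
    """Binary search for the first faulty chip (1-based), assuming one exists.

    chips_found(k) is a hypothetical probe: the number of chips the log reports
    when 1V8 and RO are shorted between chip k and chip k+1.
    """
    low, high = 1, total_chips          # the faulty chip lies in [low, high]
    while low < high:
        mid = (low + high) // 2         # always below total_chips, so the test point exists
        if chips_found(mid) == mid:
            low = mid + 1               # chips 1..mid respond, fault is further along
        else:
            high = mid                  # fault is at or before chip mid
    return low


# Simulated board with a faulty 47th chip (illustrative only).
def simulated_probe(k, faulty=47):
    return k if k < faulty else 0


print(find_faulty_chip(80, simulated_probe))  # prints 47
```

On an 80-chip board this needs at most 7 probe placements, compared with up to 79 when checking the chips one by one.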
Phenomenon 3: Single hash board Pattern NG, indicating that the response nonce data is incomplete (PT2 station)
Pattern NG occurs when the characteristics of one chip differ significantly from those of the other chips. So far the cause has been a damaged chip die, so the remedy is to replace the chip. Based on the log information, the replacement rules are as follows:
If the chip shows no visible damage, replace the chip with the lowest response rate in each domain. For example, the following figure shows one test log, in which the response rate of four chips (asic 36, 37, 43, and 75) is low. Because 36 and 37 are in the same domain, replace whichever of the two has the lower nonce count, and also replace 43 and 75.
PS: Pay special attention to the domain numbers, and note that asic numbering starts from 0.
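The replacement rule can be sketched as a grouping step: collect the low-response chips per domain and keep the one with the lowest nonce count in each domain. The example is illustrative only. It assumes 8 chips per domain with asic numbering starting from 0, and the nonce counts are made-up values, not taken from a real log.

```python
from collections import defaultdict
from typing import Dict, List

CHIPS_PER_DOMAIN = 8  # assumption: taken from the 10-domain, 8-IC layout in Section IV


def chips_to_replace(low_response: Dict[int, int],
                     chips_per_domain: int = CHIPS_PER_DOMAIN) -> List[int]:
    """Pick, per domain, the flagged chip with the lowest nonce count.

    low_response maps a 0-based asic index to its nonce count.
    """
    by_domain = defaultdict(list)
    for asic, nonces in low_response.items():
        by_domain[asic // chips_per_domain].append((nonces, asic))
    # min() compares nonce counts first; ties go to the lower asic index.
    return sorted(min(entries)[1] for entries in by_domain.values())


# Hypothetical nonce counts for the four low-response chips in the example above.
flagged = {36: 120, 37: 95, 43: 80, 75: 60}
print(chips_to_replace(flagged))  # prints [37, 43, 75]
```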
Phenomenon 4: Chip test is OK, but the serial port output does not stop during the PT2 function test (it keeps running for a long time)
Repair method: during the PT2 test, watch the serial port log. When the serial port starts running for a long time, use the short-circuit probe to short RO and 1.8V, starting from the first chip. If the long run stops after the short is applied, the first chip is OK. Continue in the same way chip by chip until you find the chip after which the long run persists even with the short applied.
Generally, this is caused by a damaged chip, so just replace it;
Phenomenon 5: PT1 chip test is OK, but the PT2 function test always reports a certain chip NG;
Repair method: check the appearance and measure the SMD capacitors and resistors in front of the chip; the usual cause is poor chip soldering, a damaged SMD capacitor or resistor, or an abnormal resistance value;
VI. Problems Caused by Control Board Faults
1. The whole miner does not operate
1) Check whether the voltages at the voltage output points are normal. For example, if 3.3V is short-circuited, disconnect U8 first; if it is still short-circuited, remove the CPU and measure again. For other abnormal voltages, generally replace the corresponding converter IC.
2) If the voltages are normal, check the soldering of the DDR/CPU;
3) Try to update the flash program with an SD card;
a. After the card recovery succeeds, the green LED indicator stays on; turn the power off and restart;
b. Wait 30s after powering on again (the time needed to enable OTP);
c. OTP (One Time Programmable) is a type of MCU memory: once the program has been written into the IC, it cannot be changed or erased;
1) A sudden power failure during OTP, or a wait of less than 30s, will prevent the control board from enabling the OTP function. If the control board then cannot start (not networked), U1 (the main control IC of the control board, FBGA package) must be replaced. The replaced U1 can no longer be used in the 19 series.
2) Once the OTP function has been enabled on a control board, its U1 cannot be used on other series of models;
2. The whole miner cannot find the IP
The IP most likely cannot be found because the miner is not operating normally; refer to item 1 above for troubleshooting.
Check the appearance and welding of the network port, network transformer T1, and CPU.
3. The whole miner cannot be upgraded
Check the appearance and welding of the network port, network transformer T1, and CPU.
4. The whole miner fails to read a hash board or reports fewer hash boards
1) Check the cable connection status.
2) Check the parts of the control board corresponding to the chain.
3) Check the wave soldering quality of the plug-in pins and the resistance around the plug-in interface.
VII. Failure Phenomenon of the Whole Miner
1. Whole miner test
Common phenomena: the IP cannot be detected, the number of fans is abnormal, or a chain is abnormal. If the test is abnormal, follow the prompts on the monitoring interface and in the test log for maintenance.
1) Fan display is abnormal: check whether the fan works normally, whether its connection to the control board is normal, and whether the control board itself is abnormal.
2) Missing chain: one of the 3 hash boards is missing. In most cases the problem is in the connection between the hash board and the control board. First check the cable for an open circuit. If the connection is OK, run the single-board PT2 test on the hash board. If it passes, the problem is most likely on the control board. If it fails, repair the board using the PT2 repair method.
3) Abnormal temperature: generally the temperature is too high. The PCB temperature limit set by the monitoring system is 90 degrees; above 90 degrees the miner raises an alarm and cannot work normally. The usual causes are a high ambient temperature and abnormal fan operation.
4) Not all chips can be found (the miner boots, but the hash rate is 2/3 or 1/3 of the normal value): refer to PT2 for testing and repair.
5) After operating for a while, there is no hash rate and the connection to the mining pool is interrupted: check the network;
6) Test status of normal miner;
7) One hash board has a low hash rate: log in to the miner's IP with the Putty software and observe whether the domain voltages of this board and the NONCE responses are normal. Then repair the board according to the prompts in the Putty log.
8) How to use Putty:
a. Open Putty, enter the IP of the miner in question, and click Open.
b. Enter the user name, password, and test command to check the NONCE response status and the status of the voltage domains. If the NONCE responses or domain voltages are abnormal, measure and repair according to the abnormal chip reported in the output.
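Two of the checks in the list above are simple arithmetic: a hash rate of about 2/3 or 1/3 of normal points to 2 or 1 working hash boards out of 3, and a PCB temperature above 90 degrees triggers the alarm. The sketch below shows both under assumptions; the function names and the hash rate figures are hypothetical, and nothing here queries a real miner.

```python
BOARDS_PER_MINER = 3
PCB_TEMP_LIMIT_C = 90  # limit quoted in item 3)


def estimate_working_boards(measured_hashrate, nominal_hashrate,
                            boards=BOARDS_PER_MINER):
    """Estimate how many hash boards contribute, from the hash rate ratio."""
    estimate = round(boards * measured_hashrate / nominal_hashrate)
    return max(0, min(boards, estimate))   # clamp to 0..boards


def over_temperature(pcb_temp_c, limit_c=PCB_TEMP_LIMIT_C):
    """Return True when the PCB temperature exceeds the monitoring limit."""
    return pcb_temp_c > limit_c


# Hypothetical readings in arbitrary hash rate units.
print(estimate_working_boards(66, 100))
print(over_temperature(93))
```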
VIII. Other Matters Needing Attention
● Routine inspection: first, visually inspect the hash board to be repaired and check for PCB deformation or scorching, and for obviously burnt, shifted, or missing parts; any such defect must be dealt with first. Second, after the visual inspection passes, measure the impedance of each voltage domain to detect short circuits or open circuits; any that are found must be dealt with first. Then check whether the voltage of each domain is about 0.36V.
● After the routine inspection passes (the short-circuit check is essential, to avoid burning the chip or other parts when power is applied to a shorted board), run the chip test with the hash board tester and locate the fault according to the test result.
● Based on the result displayed by the hash board tester, start from the chips near the reported fault and check the chip test points (CO / NRST / RO / XIN / BI) and voltages such as VDD0V8 and VDD1V8.
● According to the signal flow, the RX signal is transmitted in reverse (from chip No.76 to chip No.1), while CLK, CO, BO, and RST are transmitted forward (1-76); the abnormal fault point can be found by following the power supply sequence.
● Once the faulty chip has been located, re-solder it first. The method is to add flux around the chip (preferably no-clean flux) and heat the solder joints of the chip pins until the solder melts, so that the pins and pads re-wet and the solder reflows. If the fault remains after re-soldering, replace the chip directly.
● When testing with the hash board tester, a repaired hash board can be judged good only after it has passed at least twice. For the first test, wait for the hash board to cool down after the parts are replaced and then run the tester; after it passes, set the board aside to cool again. For the second test, wait a few minutes until the hash board has cooled down, and test again.
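The acceptance rule in the last bullet can be expressed as a short check on the test history. This is one possible reading, stated as an assumption: the board is accepted only when its two most recent tests, each run after a cool-down, both passed. The function name is hypothetical.

```python
REQUIRED_PASSES = 2


def is_good_board(test_results, required=REQUIRED_PASSES):
    """Accept the board only if the most recent `required` tests all passed.

    test_results is a list of booleans in test order (True = pass); each test
    is assumed to have been run after the board cooled down.
    """
    if len(test_results) < required:
        return False
    return all(test_results[-required:])


print(is_good_board([True]))               # one pass is not enough
print(is_good_board([False, True, True]))  # two passes after the repair
```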