Computer Architecture and Operating Systems

Course taught at the Faculty of Computer Science of the Higher School of Economics

Lecture 7

Floating-Point Format

Lecture

Slides (PDF, PPTX).

Outline

Floating-point format.
Standard IEEE 754.
Floating-point instructions.
Programs with floating-point operations.

Examples

Workshop

Find decimal values for the following binary values:

Find binary values for the following fractions:
```
1/2
1/8
3/4
5/16
11/32
```
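The repeated-doubling method for this kind of conversion can be sketched in Python, which is handy for checking answers worked out by hand. This is only an illustrative helper: the function name `fraction_to_binary` and the `max_digits` cut-off are assumptions of the sketch, and the sample value 7/16 is deliberately not one of the exercise items.

```python
from fractions import Fraction


def fraction_to_binary(numerator, denominator, max_digits=32):
    """Return the binary expansion of a non-negative fraction as a string.

    Hypothetical helper: digits after the point are produced by repeated
    doubling, stopping after max_digits digits if the expansion is infinite.
    """
    value = Fraction(numerator, denominator)
    integer_part = value.numerator // value.denominator
    frac = value - integer_part
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= 2                      # shift one binary place to the left
        bit = 1 if frac >= 1 else 0    # the digit that crossed the point
        digits.append(str(bit))
        frac -= bit
    return f"{integer_part:b}." + ("".join(digits) or "0")


print(fraction_to_binary(7, 16))  # 0.0111
```

Using `Fraction` keeps the arithmetic exact, so the digits are not affected by floating-point rounding.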
Find binary values for the following decimal values:
```
0.5
0.25
0.125
1.125
5.875
3.1875
```
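As a worked illustration of the method (using 6.625, a value that is not in the list above), the integer part is converted as usual and the fractional part is doubled repeatedly, taking the integer part of each product as the next binary digit:

\[
\begin{aligned}
6.625 &= 6 + 0.625, \qquad 6 = 110_2,\\
0.625 \times 2 &= 1.25 \;\Rightarrow\; \text{digit } 1,\\
0.25 \times 2 &= 0.5 \;\Rightarrow\; \text{digit } 0,\\
0.5 \times 2 &= 1.0 \;\Rightarrow\; \text{digit } 1,\\
6.625 &= 110.101_2 = 2^2 + 2^1 + 2^{-1} + 2^{-3}.
\end{aligned}
\]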
Write program fprint.s that inputs a single-precision and a double-precision floating-point value and prints each of them in binary format.
Write program fprint2.s that separately prints the fields (sign, fraction, exponent) of single-precision and double-precision floating-point values. The code of the previous program can be partially reused.
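To check the output of these two programs, a small Python reference model can be useful. The sketch below is not the required assembly solution; the names `float_bits` and `split_fields` are hypothetical. It relies on the IEEE 754 layouts: 1 sign bit, 8 exponent bits and 23 fraction bits for single precision, and 1, 11 and 52 bits for double precision.

```python
import struct


def float_bits(value, double=False):
    """Return the IEEE 754 bit pattern of value as a string of 32 or 64 bits."""
    fmt_float, fmt_int, width = (">d", ">Q", 64) if double else (">f", ">I", 32)
    # Reinterpret the packed floating-point bytes as an unsigned integer.
    (raw,) = struct.unpack(fmt_int, struct.pack(fmt_float, value))
    return f"{raw:0{width}b}"


def split_fields(value, double=False):
    """Return the (sign, exponent, fraction) fields as bit strings."""
    bits = float_bits(value, double)
    exp_width = 11 if double else 8
    return bits[0], bits[1:1 + exp_width], bits[1 + exp_width:]


print(float_bits(-0.75))    # 32-bit pattern of the single-precision value
print(split_fields(-0.75))  # ('1', '01111110', '10000000000000000000000')
```

For -0.75 the value is \(-1.1_2 \times 2^{-1}\), so the single-precision exponent field holds 127 - 1 = 126 and the fraction field starts with a single 1 bit.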
Write program farithm.s that inputs three double values a, b, and c, calculates the result of expression a + b - c, and prints the result.
Write program even_back.s that does the following:

Input an integer value N and then N float values. Output only the even ones, one per line, in reverse order. A float value counts as even if it is even after being converted (rounded) to an integer.

Input:
```
6
12.3
-11.0
3.25
88.01
0.0
1.25
```
Output:
```
0.0
88.01
12.3
```
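The expected behaviour can be modelled in a few lines of Python for comparison with the assembly program. This is a sketch under one assumption: rounding is to the nearest integer with ties to even, which is what Python's `round` does; the assembly solution may use a different rounding mode. The function name `even_back` and the list-of-lines input are choices made for the illustration.

```python
def even_back(lines):
    """Return the values that round to an even integer, in reverse order.

    lines[0] holds N, followed by N float values as text. The original text
    is kept so that the values are echoed exactly as they were entered.
    """
    n = int(lines[0])
    values = lines[1:1 + n]
    # Assumption: round-to-nearest (ties to even) decides evenness.
    kept = [text for text in values if round(float(text)) % 2 == 0]
    return kept[::-1]


sample = ["6", "12.3", "-11.0", "3.25", "88.01", "0.0", "1.25"]
print("\n".join(even_back(sample)))
```

On the sample input above this prints 0.0, 88.01 and 12.3, one per line.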
Write program no_dups.s that does the following:

Input an integer value N and then N double values. Output all the doubles in their original order, skipping duplicates.

Input:
```
8
12.025
34.5
-12.0
23.25
12.025
-12.0
56.75
9.125
```
Output:
```
12.025
34.5
-12.0
23.25
56.75
9.125
```
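A Python reference model of this task might look as follows. It is only a sketch: the name `no_dups` is hypothetical, and the `set` stands in for whatever lookup the assembly program uses (for example, a linear scan over the values already printed, compared for floating-point equality).

```python
def no_dups(lines):
    """Return the input values in order, keeping only first occurrences.

    lines[0] holds N, followed by N double values as text.
    """
    n = int(lines[0])
    seen = set()
    result = []
    for text in lines[1:1 + n]:
        value = float(text)          # compare numerically, not as text
        if value not in seen:
            seen.add(value)
            result.append(text)
    return result


sample = ["8", "12.025", "34.5", "-12.0", "23.25",
          "12.025", "-12.0", "56.75", "9.125"]
print("\n".join(no_dups(sample)))
```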

Homework

Write program fraction_truncate.s that does the following:

Input three natural numbers: A, B and n. Output a double F that has exactly n decimal places of A/B. You need to write a subroutine that accepts double f=A/B in fa0 and integer n in a0 and returns the rounded double F in fa0.

Hint: \(10^n*A/B < 2^{31}\)

Input:
```
123
456
7
```
Output:
```
0.2697368
```
Spoiler: \(10^n*A/B < 2^{31}\) means that you can just take the integer part of it, then divide the result back by \(10^n\).
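The approach from the spoiler can be sketched in Python to verify the expected output. The function name `fraction_truncate` mirrors the program name but is otherwise an assumption of this sketch; Python's `int` plays the role of the float-to-integer conversion with truncation.

```python
def fraction_truncate(f, n):
    """Keep n decimal places of a non-negative f by scaling and truncating."""
    scale = 10 ** n
    # The integer part of f * 10^n fits in 32 bits according to the hint.
    return int(f * scale) / scale


print(fraction_truncate(123 / 456, 7))  # 0.2697368
```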
Write program cubic_root.s that does the following:

Input double values A (positive or negative) and \(\varepsilon\) such that \(1 \le |A| \le 1000000\) and \(0.00001 \le \varepsilon \le 0.01\). Calculate the cube root of A to within \(\varepsilon\) (you do not need to round the result).

Hint: you can always calculate the cube of a number!

Input:
```
1000
0.0001
```
Output:
```
9.99995
```
Spoiler: suppose the cube root of \(|A|\) is between M and N (\(0 \le M < N\)). Select \(K=(M+N)/2\); if \(|K^3|>|A|\), the solution is between M and K, otherwise it is between K and N. The sign of A is restored at the end.
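The bisection from the spoiler can be sketched in Python as a reference model. Several details are assumptions of this sketch rather than part of the task: the starting interval \([0, |A|]\) (valid because \(|A| \ge 1\) implies that the cube root of \(|A|\) does not exceed \(|A|\)), returning the midpoint of the final interval, and handling negative input by working with \(|A|\) and restoring the sign.

```python
def cubic_root(a, eps):
    """Approximate the cube root of a to within eps by bisection."""
    target = abs(a)
    lo, hi = 0.0, target             # invariant: lo**3 <= target <= hi**3
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if mid ** 3 > target:
            hi = mid                 # root lies in the lower half
        else:
            lo = mid                 # root lies in the upper half
    root = (lo + hi) / 2
    return -root if a < 0 else root


print(cubic_root(1000, 0.0001))
```

The exact digits printed depend on the starting interval and on which point of the final interval is returned, so a correct program may print a value other than the sample output as long as it lies within \(\varepsilon\) of the true root.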
Bonus task (2 bonus points). Write program leibpi.s that does the following:

Calculate the value of π, accurate to N decimal places, using the Leibniz formula for π. Input N, output the result. Use the function defined in the fraction_truncate task to cut off the remaining digits. Keep in mind that the formula actually calculates π/4, so you should probably start with 4 instead of 1 to get full accuracy. Warning: the algorithm is slow; do not panic, but keep the code as simple as possible.

Input:
```
4
```
Output:
```
3.1416
```
Hint: to gain performance, keep everything in registers.
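For reference, the Leibniz series and the standard alternating-series error bound can be written as follows, where \(S_m\) is notation introduced here for the partial sum of the first m terms multiplied by 4:

\[
\frac{\pi}{4} = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1},
\qquad
S_m = 4 \sum_{k=0}^{m-1} \frac{(-1)^k}{2k+1},
\qquad
\left| \pi - S_m \right| < \frac{4}{2m+1}.
\]

The bound shows why the algorithm is slow: each extra decimal place needs roughly ten times as many terms. The task does not specify a stopping rule, so using this bound as one is only a possible choice.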

References

Standard IEEE 754 (Wikipedia).
Standard IEEE 754-2008.
Floating point. Section 3.5 in [CODR].
Floating point. Section 2.4 in [CSPP].
RISC-V Assembly Programmer’s Manual.
RISC-V Formal Specifications in nML.