Von Neumann + Turing Machine
How can electricity represent state in a stable way?
When computers were first designed, the priority was not simplicity but reliability: tasks had to run automatically and produce dependable results.
Simplicity always rests on a stable and reliable foundation.
The decimal system was tried first, but telling ten different current levels apart is difficult, and the states are hard to keep stable. The most reliable check is between just two states, energized and not energized. Energized is defined as 1 and not energized as 0, which gives the state logic of 1 and 0.
So how do 0 and 1 represent numbers and characters?
First work out which characters need to be represented. English letters, digits, and common symbols come to only a little over 100 characters, so 7 binary digits (2**7 = 128 values) are enough to express them all; this is the ASCII code. For extensibility, one extra bit was added to leave room for expansion.
Because a character then needs at most 8 binary bits, 8 bits were specified as one storage unit, the byte:
8 bits = 1 byte
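As a quick check of the arithmetic above, the sketch below asks Python how many bits the largest ASCII code (127) needs and how many bytes one ASCII character occupies once stored. The variable names are illustrative only.

```python
# The largest ASCII code point is 127, so 7 bits are enough for all of ASCII.
ascii_max = 127
bits_needed = ascii_max.bit_length()

# When stored, one ASCII character occupies exactly one byte (8 bits).
stored = "A".encode("ascii")
bits_per_char = len(stored) * 8

print(bits_needed, len(stored), bits_per_char)
```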
By convention, characters are represented by numbers, and numbers are expressed in binary, that is: character -> number -> binary.
Text can therefore be stored by the computer as binary, and the binary stored on the computer can be converted back into text.
The conversion between decimal and binary is fixed. The conversion between characters and numbers is called character encoding; ASCII, Unicode, and UTF-8 all define mapping relationships between characters and numbers.
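The character -> number -> binary chain can be sketched as a round trip. This is a minimal illustration that assumes ASCII-only text; the sample string "Hi" and the variable names are chosen for the example.

```python
text = "Hi"

# character -> number: encode the text and read each byte as an integer
numbers = list(text.encode("ascii"))

# number -> binary: write each number as 8 binary digits
bits = [format(n, "08b") for n in numbers]

# binary -> number -> character: reverse the two steps
restored = bytes(int(b, 2) for b in bits).decode("ascii")

print(numbers, bits, restored)
```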
A few relationships need to be clear:
1. The relationship between characters and numbers is a mapping, and it is a human-defined standard
Mappings of this kind are common in everyday life, for example:
a. ID card information and ID number
b. The database id and the row information
c. Order information and order number
d. Employee ID and employee
e. Dictionary keys and values
f. Memory address and the value stored at that address
2. The relationship between numbers and binary is like a law of mathematics or physics: a fixed conversion method that is hard-coded
3. Octal and hexadecimal are built on binary and have no direct relationship with decimal. They exist mainly for readability and are two compact ways of writing binary
For example, binary 00000000 is one storage unit (one byte). Octal groups the bits in threes (000 000 000, padded on the left) and writes each group as its decimal value. The smallest group value is 0 and the largest is 7, so each octal digit ranges over 0-7.
Hexadecimal groups the bits in fours (0000 0000) and writes each group as its decimal value. The smallest value is 0 and the largest is 15, so each hexadecimal digit ranges over 0-15. Because values above 9 cannot be written as a single decimal digit, the letters a b c d e f stand for 10 11 12 13 14 15.
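The grouping rule can be sketched in a few lines. The helper names group_bits and regroup are made up for this illustration; Python's own oct() and hex() give the same digits.

```python
DIGITS = "0123456789abcdef"

def group_bits(bits, size):
    """Split a binary string into groups of `size` bits, padding on the left."""
    padded_len = -(-len(bits) // size) * size  # round up to a multiple of size
    padded = bits.zfill(padded_len)
    return [padded[i:i + size] for i in range(0, padded_len, size)]

def regroup(bits, size):
    """Turn each group into one digit: size=3 gives octal, size=4 gives hex."""
    return "".join(DIGITS[int(group, 2)] for group in group_bits(bits, size))

bits = "11111111"
print(group_bits(bits, 3), regroup(bits, 3))  # groups of three bits -> octal
print(group_bits(bits, 4), regroup(bits, 4))  # groups of four bits -> hexadecimal
```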
Hexadecimal is commonly used for memory addresses, IPv6 addresses, color tables, MAC addresses, and binary data (the \x escapes inside bytes literals with the b/B prefix).
An IPv4 address (32 bits, dotted decimal) has the form x.x.x.x, where each x is a decimal number represented by 8 bits.
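To make the "four 8-bit fields in 32 bits" layout concrete, here is a small sketch that packs a dotted-decimal address into one integer and unpacks it again. The function names and the sample address are assumptions for illustration; the standard library's ipaddress module is the usual tool for real work.

```python
def ipv4_to_int(address):
    """Pack the four 8-bit fields of a dotted-decimal address into 32 bits."""
    value = 0
    for part in address.split("."):
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"field out of 8-bit range: {part}")
        value = (value << 8) | octet
    return value

def int_to_ipv4(value):
    """Unpack a 32-bit integer back into dotted-decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

packed = ipv4_to_int("192.168.1.1")
print(packed, format(packed, "032b"), int_to_ipv4(packed))
```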
Octal and hexadecimal are both based on binary.
Python base conversion functions
Decimal to other bases
To binary: bin(), prefix 0b
To hexadecimal: hex(), prefix 0x
To octal: oct(), prefix 0o
The binary, octal, and hexadecimal results are strings carrying the prefix: "0b...", "0o...", "0x..."
    # Decimal number
    number = 9999
    print("Decimal to other bases ".ljust(40, "*"))
    # Decimal to binary
    b_number = bin(number)
    print("binary:", b_number)
    # Decimal to octal
    o_number = oct(number)
    print("octal:", o_number)
    # Decimal to hexadecimal
    h_number = hex(number)
    print("hexadecimal:", h_number)
Other bases to decimal: int(..., base), where base specifies the base of the input string
    # Decimal number and its string forms in other bases
    number = 9999
    b_number = bin(number)
    o_number = oct(number)
    h_number = hex(number)
    # Other bases to decimal
    # Binary to decimal
    num_b = int(b_number, base=2)
    print(num_b)
    # Octal to decimal
    num_o = int(o_number, base=8)
    print(num_o)
    # Hexadecimal to decimal
    num_h = int(h_number, base=16)
    print(num_h)
String to bytes (binary string)
A character encoding must be specified; the result is a bytes object, displayed with the b"..." prefix
    # Text string (placeholder value)
    song = "song title"
    byte_song = song.encode(encoding="utf-8")
    print(byte_song)
    # Equivalent form
    eq_byte_song = bytes(song, encoding="utf-8")
    print(eq_byte_song)
    print(byte_song == eq_byte_song)
Bytes to string
A character encoding must be specified here as well
    # Text string (placeholder value)
    song = "song title"
    # String to bytes
    byte_song = song.encode(encoding="utf-8")
    print(byte_song)
    # Bytes to string
    print("bytes to string".rjust(40, "_"))
    dec_song = byte_song.decode(encoding="utf-8")
    print(dec_song)
    # Equivalent form
    str_song = str(byte_song, encoding="utf-8")
    print(str_song)
    print(dec_song == str_song)
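To convert from decimal to base 2, 8, or 16, divide repeatedly by the base and collect the remainders; reading the remainders in reverse order gives the result.
To convert from another base to decimal, multiply each digit by the base raised to the power of its position (counted from 0, right to left) and sum the results.
The conversion method is fixed, like a formula or a law.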
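The two manual methods can be written out as a short sketch. It assumes non-negative integers and bases up to 16; to_base and from_base are illustrative names, and the built-ins bin(), oct(), hex(), and int() remain the practical choice.

```python
DIGITS = "0123456789abcdef"

def to_base(number, base):
    """Decimal -> base: divide repeatedly, collect remainders, read in reverse."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))

def from_base(text, base):
    """Base -> decimal: sum digit * base**position, counting from the right."""
    total = 0
    for position, char in enumerate(reversed(text)):
        total += DIGITS.index(char) * base ** position
    return total

print(to_base(9999, 2), to_base(9999, 8), to_base(9999, 16))
print(from_base(to_base(9999, 16), 16))
```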
Integers are divided into signed and unsigned types. A number (integer or floating point) is generally stored in 8, 16, 32, or 64 bits.
In a signed type the highest (leftmost) bit is the sign bit: 0 means positive and 1 means negative.
Because one bit is taken for the sign, the signed range is split into two halves, one for negative values and one for non-negative values: -2**(n-1) to 2**(n-1) - 1, where n = 8/16/32/64.
An unsigned type covers 0 to 2**n - 1.
The widths differ accordingly: 1, 2, 4, or 8 bytes.
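The range formulas can be checked directly. The sketch below is illustrative (signed_range and unsigned_range are made-up helper names); it also uses int.to_bytes to show that a 1-byte signed value cannot hold 128.

```python
def signed_range(n):
    """Range of an n-bit signed integer: -2**(n-1) to 2**(n-1) - 1."""
    return -2 ** (n - 1), 2 ** (n - 1) - 1

def unsigned_range(n):
    """Range of an n-bit unsigned integer: 0 to 2**n - 1."""
    return 0, 2 ** n - 1

for n in (8, 16, 32, 64):
    print(n, "bits =", n // 8, "bytes", signed_range(n), unsigned_range(n))

# 127 fits in one signed byte, 128 does not.
fits = (127).to_bytes(1, "big", signed=True)
try:
    (128).to_bytes(1, "big", signed=True)
    overflowed = False
except OverflowError:
    overflowed = True
print(fits, overflowed)
```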
Python functions that map characters to their ASCII numbers
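The original listing for this heading is not preserved here. A likely candidate, offered as an assumption, is the built-in pair ord() and chr(), which convert between a character and its number:

```python
code = ord("A")   # character -> number
char = chr(97)    # number -> character

print(code, char, bin(code))
```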
Language -> number -> 0/1 binary
This mapping table is called a character encoding.
The problem a character encoding solves is the mapping between characters and decimal numbers, which is defined by people.
Chinese: GB2312 -> GBK (a Chinese character takes 2 bytes, an English character 1 byte)
International: Unicode (2-4 bytes) -> UTF-8 (1-4 bytes)
1. Supports the characters of languages worldwide
2. Contains a global character encoding mapping: text in any national encoding can be converted to Unicode, and Unicode can be converted back to any national encoding
3. Supported by software and hardware worldwide
Stored at a fixed width (as in UCS-2), Unicode needs at least 2 bytes per character, whereas ASCII needs only one.
Text that was plain ASCII would then take twice the space for storage and network transmission, which is unacceptable.
UTF-8 emerged to solve this problem. Network transmission and storage use UTF-8 while the operating system works with Unicode, so efficient transmission and storage together with support for the world's languages become possible.
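The size differences can be measured by encoding the same characters in several ways. In this sketch "utf-16-le" stands in for a 2-byte Unicode form (an assumption made for illustration), and the two sample characters are arbitrary.

```python
samples = ["A", "中"]
encodings = ["gbk", "utf-16-le", "utf-8"]

# Number of bytes each character takes under each encoding.
sizes = {
    (ch, enc): len(ch.encode(enc))
    for ch in samples
    for enc in encodings
}

for (ch, enc), size in sizes.items():
    print(ch, enc, size)
```

An English letter stays at one byte in GBK and UTF-8 but takes two in UTF-16, which is the doubling described above.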
Encoding in Python
First of all, what exactly is encoding in Python?
Let's look at two things: the file that stores the code, and how that file is loaded into memory and processed by the interpreter.
The code we type is, in essence, text data.
Text data is converted into binary through an encoding table and then stored on the hard disk.
Binary data stored on the computer likewise needs an encoding table to be converted back into text data.
What is encoding in Python?
The default source file encoding in Python 3 is UTF-8. The editor we use to edit files also has a default encoding, generally UTF-8. If the text in a file is not encoded in UTF-8, a header line in the Python file must tell the interpreter which encoding the file uses.
What the interpreter reads is not the text we see in the editor but the 0s and 1s stored on the hard disk.
The interpreter tries to decode the binary data read from disk with the default UTF-8 encoding to recover the file's text. If the file is not actually UTF-8, garbled characters appear and the interpreter fails to parse the text.
In that case, specify the encoding of the file at the beginning of the Python source file to tell the interpreter how to decode it.
Inside the interpreter, text is Unicode. The interpreter decodes the binary data it reads using the file's character encoding and holds the resulting text as Unicode. As long as the operating system supports Unicode, the interpreter can execute normally and output the result.
Interpreter: binary data -> look up character encoding table -> text data -> Unicode text data
Editor: binary data -> look up character encoding table -> text data corresponding to the encoding table
Both the interpreter and the editor start from the file's binary data and convert it into text through an encoding, but the interpreter goes on to parse that text into lower-level instructions and execute them.
Note that the encoding of a Python source file and the interpreter's internal text representation are not the same thing:
source files default to UTF-8, while the interpreter represents text internally as Unicode.
With this understood, garbled text becomes straightforward to troubleshoot.
Garbled text means the character encoding was specified incorrectly: when the stored binary is converted back into text, the wrong character set is chosen.
1. For client/server software, check whether the default encodings of the client and the server are the same
2. For a web back end, check whether the database default encoding, the table encoding, and the encoding of each language's database connection interface are consistent
3. For files, check whether the editor's default encoding matches the encoding the file was originally saved in; read with whatever encoding was used to store
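The "read with whatever encoding was used to store" rule can be reproduced in a few lines. This is a minimal sketch with an arbitrary sample string; it stores text as GBK and then decodes the same bytes three ways.

```python
text = "中文"                 # sample text, chosen for illustration
stored = text.encode("gbk")   # what ends up on disk or on the wire

# Same encoding for reading as for storing: the text comes back intact.
same = stored.decode("gbk")

# Wrong encoding, strict codec: decoding fails outright.
try:
    stored.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Wrong encoding, permissive codec: no error, but the text is garbled.
garbled = stored.decode("latin-1")

print(same, utf8_failed, garbled)
```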
Ways to declare the character encoding of a Python source file
1. # coding: utf-8
2. # -*- coding: utf-8 -*-
Both start with # and must be written on the first or second line of the source file
    # -*- coding:utf-8 -*-
    # coding: utf-8
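One way to see how such a declaration is picked up is the standard library function tokenize.detect_encoding, which inspects the first lines of a source file's bytes. The helper name declared_encoding and the two sample sources below are assumptions made for this sketch.

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the encoding Python would use to decode this source file."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

# A source file saved as GBK that declares its encoding on the first line.
with_declaration = '# -*- coding: gbk -*-\nname = "中文"\n'.encode("gbk")
# A source file with no declaration: the UTF-8 default applies.
without_declaration = b'name = "abc"\n'

print(declared_encoding(with_declaration))
print(declared_encoding(without_declaration))
print(with_declaration.decode(declared_encoding(with_declaration)))
```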
This article is a reprint; copyright belongs to the original author. In case of infringement, please contact the editor to have it removed.
Original address: www.tuicool.com/articles/Rz...