
Assembly Language Reimagined: Programming the Intel x64 Microprocessor in Linux

Author: John Schwartzman


Learning assembly language won’t make you a faster programmer. It won’t enable you to create portable, write-once, run-anywhere programs. So why learn it? The answer is that it will make you a better programmer. Author John Schwartzman takes a fresh look at low-level programming and explores how to write programs using the BIOS and glibc. This laboratory-based book aids the writing of high-level structured programs by showing what the processor can and can’t do and how it does it. You’ll take apart high-level structured C/C++ and show what the CPU is doing at every stage of the program. The book introduces programs and activities throughout the development process, providing sample code, makefiles, and shell scripts for each example program. With the help of Assembly Language Reimagined you’ll become a more capable and versatile computer engineer.

What You Will Learn
Explore a new perspective on the Intel x64 microprocessor for low-level programming.
Understand what a processor is doing while a high-level structured computer language program is being run.
Solve problems with the help of software.
See why assembly language programming is essential for every serious student of computer science.

Who This Book Is For
Embedded Linux and Assembly developers, engineers and programmers, hobbyists from the Maker community, as well as college and graduate level students who have some prior knowledge of a structured high-level language like C or C++.

📄 File Format: PDF
💾 File Size: 8.8 MB
📄 Text Preview (First 20 pages)


📄 Page 2
Assembly Language Reimagined
Programming the Intel x64 Microprocessor in Linux
John Schwartzman
📄 Page 3
Assembly Language Reimagined: Programming the Intel x64 Microprocessor in Linux

ISBN-13 (pbk): 979-8-8688-1723-6
ISBN-13 (electronic): 979-8-8688-1724-3
https://doi.org/10.1007/979-8-8688-1724-3

Copyright © 2025 by John Schwartzman

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: James Robinson Prior
Development Editor: Jim Markham
Coordinating Editor: Gryffin Winkler

Cover image designed by Freepik (www.freepik.com)

Distributed to the book trade worldwide by Springer Science+Business Media New York, 1 New York Plaza, New York, NY 10004. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a Delaware LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

For information on translations, please e-mail booktranslations@springernature.com; for reprint, paperback, or audio rights, please e-mail bookpermissions@springernature.com.

Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales web page at http://www.apress.com/bulk-sales.

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub: https://www.apress.com/gp/services/source-code.

If disposing of this product, please recycle the paper.

John Schwartzman
Ellicott City, MD, USA
📄 Page 4
2 Kate, 4 Everything.
📄 Page 5
Table of Contents

About the Author
About the Technical Reviewer
Acknowledgments
Preface
Chapter 1: Using BIOS Services
    What Is the BIOS?
    Getting Started
    The Anatomy of a Makefile
    Running the DDD Debugger
    Activities
Chapter 2: Extending BIOS Services
    A Brief Introduction to Boolean Logic Gates
    Representation of Numbers in the Computer
    The DDD Debugger
    Activities
Chapter 3: Prefer glibc over BIOS Calls, uname Reprise
    The Stack
    The C Calling Convention
    The Linker
    Data Sections
    The uname2.asm Program
    The os-distro.sh Shell Script
    Activities
📄 Page 6
Chapter 4: Passing Information to a Program on the Command Line
    The DDD Debugger
    Activities
Chapter 5: Using Macros and Passing Arguments on the Stack
    More About Macros
    Activities
Chapter 6: Conditional Compilation and Conditional Build
    The DDD Debugger
    Activities
Chapter 7: Recursion
    Activities
Chapter 8: Using Floating Point Registers
    Activities
Chapter 9: The commaSeparate Utility
    Activities
Chapter 10: The hhmmss Utility Program
    Activities
Chapter 11: Creating and Using a Shared Library
    Activities
Chapter 12: Sorting an Array of Integers
    Activities
Chapter 13: Sorting an Array of Strings
    Activities
    Searching an Array of Strings
    Activities
Chapter 14: Finding, Reading, and Selecting File and Directory Metadata
    Activities
📄 Page 7
Chapter 15: Creating and Sorting a Linked List
    Activities
Chapter 16: Reading and Sorting File and Directory Information by Reading Directories
    Activities
Chapter 17: Reading File and Directory Information with the Help of the Linux Shell Scripting Language, BASH
    Activities
Afterword
Appendix A: Installing the Software
Glossary
Endnotes
Index
📄 Page 8
About the Author

John Schwartzman is a hardware/software engineer with over 40 years of industry and teaching experience in hands-on coding and design. He has managed groups in tech companies large and small and is a regular writer for Linux Magazine and Linux Format.
📄 Page 9
About the Technical Reviewer

Seth Kenlon is a sysadmin, open source and free culture advocate, Java and Lua programmer, game designer, and tabletop gamer. He has worked at tech startups, at Weta Digital on movies, and is currently employed at Red Hat/IBM as a professional Linux geek.
📄 Page 10
Acknowledgments

Thanks are gratefully given to Gryffin Winkler and his review team at Apress. Jim Markham provided much helpful feedback.
📄 Page 11
Preface

I started writing this book with the idea that knowledge of assembly language programming would be useful to software engineers. I finished this book with the firm conviction that knowledge of assembly language programming is essential for every serious student of computer science. Writing this book has made me a better computer engineer. I think that working through this book will elevate and inform your own programming.

The emphasis in this book is on what the processor is doing while a high-level structured computer language program is being run. The book describes what a computer can, and can’t, do and how it can solve problems with the help of software. This book applies to the entire field of computer science and not just the somewhat narrow area of assembly language programming.

What Is Assembly Language?

Assembly language is a low-level programming language that is specific to a particular processor. It is used to program that processor at the hardware level. Compilers for languages like C, C++, Pascal, FORTRAN, BASIC, COBOL, Python, Go, Rust, etc., understand assembly language because that’s what they use to break down a high-level language program into its equivalent low-level assembly language instructions. Assembly language is relevant to all high-level computer languages. C++ is used to create many different programming languages. The C++ compiler strings together lots of assembly language instructions to do the actual work. Every kind of program ultimately executes machine language on the computer hardware.

Why Learn Assembly Language?

Learning assembly language won’t make you a faster programmer. It won’t enable you to create portable, write-once, run-anywhere programs. It’s not new. It’s not object oriented. So why learn it? The answer is that it will make you a better programmer. By learning just what a processor can and can’t do, you gain a deeper understanding of computer science.
📄 Page 12
A processor doesn’t just perform arithmetic; it also performs Boolean logic operations. Understanding Boolean logic operations teaches you about the Boolean logic gates inside the central processing unit (CPU) of your computer. It shows you how your programs use these logic gates to make decisions about program flow. Understanding this will make you a more capable and versatile computer engineer.

What Are Mnemonics?

Mnemonics are simple names or abbreviations given to numeric machine language instructions. Assembly language is simply machine language with mnemonics substituted for numeric op codes. A processor contains hundreds of numeric instructions (op codes). Assembly language allows you to write programs using mnemonics like ADD, SUB, MOV, JMP, AND, OR, and so on. A mnemonic like MOV (move) can indicate many different types of moves. We can move a number from a memory location to a register (a named, very fast, 64-bit memory location located inside the CPU), from a register to a memory location, from one register to another, from an immediate data value specified in the program to a register, etc. These are known as addressing modes, and they are determined by the arguments supplied with a mnemonic in an assembly language statement. We use mnemonics because, as you can imagine, they’re a lot easier to remember than numeric op codes.

What Is Embedded Programming?

In embedded programming, we regulate voltages and currents in a physical machine (hardware) that contains a CPU and memory. In many cases, a hardware device has time constraints associated with it. When reading from the read sensor of a spinning disk, you have only a small interval of time after you process a bit before the next bit arrives and must be processed. The bits are stored serially on the disk. As bit rates increase, the interval of time between bits falls and the software must respond more quickly than before. At some point, a high-level language will fail to respond quickly enough to reassemble the bits into bytes, and assembly language must be employed in its stead.
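To make the "reassemble the bits into bytes" step concrete, here is a minimal C sketch of the logic involved. It is an illustration only, not code from the book: the function name pack_bits, the one-bit-per-array-element input, and the most-significant-bit-first ordering are all assumptions chosen for the example. It also shows the Boolean operations (shift, OR, AND) that the processor performs for this kind of work.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack a stream of serial bits into bytes.
 * Assumptions (not from the book): each element of bits[] holds one
 * bit in its lowest position, and bits arrive most significant first.
 * Returns the number of complete bytes stored in out[]; trailing bits
 * that do not fill a whole byte are discarded. */
size_t pack_bits(const uint8_t *bits, size_t nbits, uint8_t *out)
{
    size_t nbytes = 0;
    uint8_t current = 0;

    for (size_t i = 0; i < nbits; i++) {
        /* Shift the partial byte left and OR in the newest bit. */
        current = (uint8_t)((current << 1) | (bits[i] & 1u));

        /* Every eighth bit completes a byte. */
        if ((i & 7u) == 7u) {
            out[nbytes++] = current;
            current = 0;
        }
    }
    return nbytes;
}
```

Feeding it the sixteen bits 01001000 01101001 would yield the two bytes 'H' and 'i'. On real hardware the loop body would have to finish within one bit interval, which is the timing constraint described above.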
📄 Page 13
Assembly language is very useful in embedded programming because it is closer to the physical machine than a high-level language. It is very easy to include assembly language modules with high-level language modules to facilitate the development of embedded systems.

What Are High-Level Computer Languages?

High-level languages make programs portable. They enable you to program without worrying about the low-level details of how a CPU works, and they don’t appear to care what CPU is inside your computer. They provide the abstraction that lets you think about problems at a higher level. Every computer must, however, have high-level language compilers written specifically for the CPU of that computer. The executable code is written for one specific CPU only.

Object-oriented languages like C++, Java, C#, Python, and so on are “higher” high-level languages that enable you to incorporate the problem domain into your program. The programs you create with an object-oriented language “understand” your application’s problem domain and not just the generic algorithms used to operate on data. For example, a student program written for a university will have a function to compute grade point average. But assembly language is at the heart of every high-level language. The machine code which runs on your computer is created by the assembler, which is inside every high-level language compiler.

Why Study the Intel x86_64 64-Bit Microprocessor?

A great number of the computers in homes, offices, factories, schools, and laboratories employ Intel x86_64 CPUs, so in this book we’re writing assembly language in the Intel x86_64 dialect. We’re using the Linux operating system, so we start by writing a couple of simple assembly language programs that use Linux kernel services. These are low-level services that the operating system makes available to compilers and assemblers. In Chapter 3, we’ll begin using the C run-time library (glibc) to write more complicated programs. In many cases, the glibc methods are thin wrappers around the Linux kernel services.

Although this discussion is restricted to the Intel x86_64 64-bit CPU, it is very easy to transfer a knowledge of assembly language programming to other
📄 Page 14
processors. With other processors, you will encounter slightly different mnemonics, but the basic instructions and addressing modes are similar. In general, we present each example assembly language program with its C or C++ language equivalent.

Why Linux?

This text focuses on the Linux operating system. Linux is an open source, multiuser UNIX-like operating system. It is not tied to a specific manufacturer like Microsoft or Apple. You don’t have to pay a licensing fee to use Linux. It is a popular choice for web servers, supercomputers, and higher education.

What Is a Makefile?

The programs we build come in two basic flavors, release and debug. Debug is good for writing and testing. Release mode produces a smaller executable. We use release mode for distribution and debug mode for development. Release mode programs start from the address of the first instruction in memory, executing each instruction in memory until the last instruction is reached, at which point we return to the BIOS. Debug mode programs can be run just like release mode programs, but they can also be run inside a debugger.

A debugger is a multithreaded program that lets you stop the program under test at critical points and examine the contents of registers, variables, instructions, and memory. It is invaluable for understanding and explaining how a computer program works. It’s also invaluable for finding and correcting logic errors in your code. Syntax errors and typing errors are usually caught in the edit-make cycle, but many logic errors require a debugger and a skilled debugger user to unravel.

A makefile can create both debug and release forms of the executable for you. This textbook provides a makefile for every chapter. The makefiles will build each example program in release mode or in debug mode. Sometimes you will have to modify the makefile to produce additional executables. Executing the debug or release executable will help demonstrate the concepts of the chapter. Release mode strips out the debugging information (line number references to the source code) from the final executable file. During the development process, we usually build in debug mode.
📄 Page 15
Appendix A tells you how to download the editor, assembler, compilers, make utility, debugger, source code, and other tools you’ll need to complete the examples and activities in this book.

The Need for Speed

The CPU is driven by a square-wave clock. On a personal computer the clock runs at around 3 GHz (3 billion cycles per second), so every clock cycle lasts approximately 0.33 ns (one-third of a billionth of a second). The speed at which a program runs depends upon which component of the computer is doing the work at the present moment. The fastest components are the registers inside the CPU. An instruction that moves data from one register to another takes one clock cycle, because all of the action takes place inside the CPU. Moves from a register to memory take longer because we have to wait for the memory address to stabilize on the address bus, for the data to stabilize on the data bus, and for the memory device to be selected. Immediate mode operations like mov rax, 0 (move zero into the rax register) take about the same time as a register to memory transfer. Hard disk to register moves take longer still, since we have to wait for the disk to spin up and to be accessed. Other IO (input/output) hardware operations take longer still.

The computer does its best by caching program memory in a high-speed memory cache built into the CPU. It also predicts what future accesses will be needed by caching instructions following a jump. It can load instructions speculatively into cache so that it has cached instructions no matter which way the jump goes. Modern CPUs can have many cores, so that instructions can be executed in parallel if the software supports this. Modern CPUs also support pipelining, in which more than one instruction is in execution at the same time.

Although the computer itself has multiple methods for increasing program speed, assembly language programming is another way to increase speed. No high-level computer language offers the fine-grained control of assembly language.
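The clock arithmetic above can be checked with a few lines of C. This is only a worked example of the period = 1 / frequency relationship; the function names clock_period_ns and cycles_to_ns are hypothetical and do not appear in the book.

```c
/* Hypothetical helpers illustrating period = 1 / frequency. */

/* Clock period in nanoseconds for a clock frequency given in hertz. */
double clock_period_ns(double frequency_hz)
{
    return 1.0e9 / frequency_hz;   /* 1e9 ns per second / cycles per second */
}

/* Time in nanoseconds taken by a given number of clock cycles. */
double cycles_to_ns(double cycles, double frequency_hz)
{
    return cycles * clock_period_ns(frequency_hz);
}
```

For a 3 GHz clock, clock_period_ns(3.0e9) evaluates to roughly 0.33 ns, matching the figure in the text, and an operation that needs three cycles takes about 1 ns.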
📄 Page 16
CISC vs. RISC

A Complex Instruction Set Computer (CISC) like the Intel x86_64 has many complicated instructions that take much longer than a single clock cycle to execute. A Reduced Instruction Set Computer (RISC), on the other hand, tends to avoid complicated instructions and strives to execute every instruction in its instruction set in fewer clock cycles than its CISC counterparts. The assumption is that any deficiencies in the instruction set will be taken care of in software. Assembly language helps achieve the best performance from either architecture.

Practice Makes (Nearly) Perfect

You learn a computer language through constant practice, so I have included many questions and activities in each chapter. I had a lot of fun writing this textbook. I hope you find it helpful, relevant, and enjoyable! I hope it gives you a more intuitive understanding of high-level structured code.

John Schwartzman
January 2025
📄 Page 17
© John Schwartzman 2025. J. Schwartzman, Assembly Language Reimagined, https://doi.org/10.1007/979-8-8688-1724-3_1

CHAPTER 1
Using BIOS Services

What Is the BIOS?

The Basic Input Output System (BIOS) is a non-volatile program burned into read-only memory (ROM). It is available as soon as the computer is powered on. It contains instructions for interfacing to the hardware peripherals as well as instructions for loading the operating system. The BIOS provides a low-level interface to hardware peripherals. The UEFI (Unified Extensible Firmware Interface) is a newer open source replacement for the BIOS. When the computer is powered on, all of its intelligence is in the BIOS or the UEFI. As part of the boot process, the operating system is loaded into RAM (random access memory) and executed by the BIOS or UEFI.

Getting Started

It has become obligatory to introduce every new programming language with a program that prints “Hello, world!” to the console, so we’ll start there.[1] Listing 1-1 shows the C language code, hello.c. Listing 1-2 shows the equivalent assembly language code, hello.asm. Listing 1-3 shows the makefile, Makefile, which is used to build (assemble, compile, and link) the two programs in this chapter. Listing 1-4 shows the maketest.sh shell script that is used in every build.

The hello executable is made by assembling and linking hello.asm, which calls two BIOS subroutines (the SYS_WRITE and SYS_EXIT services, invoked with the syscall instruction) to do its work. The a.out executable is made from hello.c and two GNU C library (glibc) subroutines. The Makefile contains instructions to assemble, compile, link, and create these two executable programs.
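As a bridge between the two programs, the following C sketch performs the same write that hello.asm requests, but through the POSIX write() wrapper that glibc provides rather than through printf. It is an illustration under stated assumptions, not one of the book's listings: the function name say_hello is hypothetical, and the return value imitates the exit status hello.asm computes (bytes written minus message length, so 0 means the whole message was written).

```c
#include <unistd.h>   /* write(), ssize_t */

/* Hypothetical helper, not from the book: write the greeting to the
 * file descriptor fd using the write() wrapper. Returns 0 if every
 * byte was written, a nonzero value otherwise. */
int say_hello(int fd)
{
    static const char msg[] = "\nHello, world!\n\n";
    const size_t len = sizeof msg - 1;     /* exclude the terminating NUL */

    ssize_t written = write(fd, msg, len); /* fd, buffer, byte count */
    return (int)(written - (ssize_t)len);
}
```

A main function could simply return say_hello(STDOUT_FILENO), so that the process exit status plays the role of the value hello.asm passes to SYS_EXIT.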
📄 Page 18
Listing 1-1. hello.c

// hello.c
// John Schwartzman, Forte Systems, Inc.
// 04/06/2023

#include <stdio.h>      // declaration of printf
#include <stdlib.h>     // defines EXIT_SUCCESS

int main()
{
    printf("\nHello, world!\n\n");
    return EXIT_SUCCESS;
}

Listing 1-2. hello.asm

;======================================================================
; hello.asm
; John Schwartzman, Forte Systems, Inc.
; 04/09/2023
;
;======================== CONSTANT DEFINITIONS ========================
LF              equ     10      ; ASCII linefeed character
EXIT_SUCCESS    equ     0       ; Linux apps normally return 0 for success
STDOUT          equ     1       ; destination for SYS_WRITE
SYS_WRITE       equ     1       ; kernel SYS_WRITE service number
SYS_EXIT        equ     60      ; kernel SYS_EXIT service number

;============================= CODE SECTION ===========================
section .text
global  _start

_start:
    mov     rdi, STDOUT         ; 1st arg to SYS_WRITE - where to write
📄 Page 19
    lea     rsi, [msg]          ; 2nd arg to SYS_WRITE - what to write
    mov     rdx, MSGLEN         ; 3rd arg to SYS_WRITE - how much to write
    mov     rax, SYS_WRITE      ; tell BIOS to call SYS_WRITE service
    syscall

    sub     rax, MSGLEN         ; syscall ret with rax = num bytes written
    mov     rdi, rax            ; 1st arg to SYS_EXIT = 0 if MSGLEN char written
    mov     rax, SYS_EXIT       ; prepare to call SYS_EXIT
    syscall                     ; tell BIOS to call SYS_EXIT service

;======================= READ-ONLY DATA SECTION =======================
section .rodata
msg:        db      LF, "Hello, world!", LF, LF
MSGLEN:     equ     $-msg
;======================================================================

Listing 1-3. Makefile for hello

#######################################################################
#
# Makefile for hello
# John Schwartzman, Forte Systems, Inc.
# 04/06/2023
#
# Commands: make .release, make .debug, make clean
#           make = make .release
# Requires: ../maketest.sh
#
#######################################################################
PROG  := hello
SHELL := /bin/bash

.release: $(PROG).asm $(PROG).c Makefile
📄 Page 20
	@source ../maketest.sh && test .release .debug
	yasm -f elf64 -o $(PROG).obj $(PROG).asm    # hello.asm => hello.obj
	ld $(PROG).obj -o $(PROG)                   # hello.obj => hello
	gcc -O3 $(PROG).c                           # hello.c => a.out

.debug: $(PROG).asm $(PROG).c Makefile
	@source ../maketest.sh && test .debug .release
	yasm -f elf64 -g dwarf2 -o $(PROG).obj $(PROG).asm  # hello.asm => hello.obj
	ld -g $(PROG).obj -o $(PROG)                # hello.obj => hello
	gcc -g $(PROG).c                            # hello.c => a.out

clean:
	rm -f $(PROG) $(PROG).obj a.out .debug .release
#######################################################################

Listing 1-4. maketest.sh

#!/bin/bash
###########################################################################
# maketest.sh
# John Schwartzman, Forte Systems, Inc. 06/03/2019
#
# A makefile helper script to manage .debug and .release makefiles
# using the same source, object and executable files.
# In Makefile use: @source ../maketest.sh && test .release .debug
#                  @source ../maketest.sh && test .debug .release
# Invoke Makefile with make .release, make .debug
#
###########################################################################
function test()
{
    if [[ ! -f $1 ]]; then
        touch $1;
        rm -f $2;
The above is a preview of the first 20 pages. Register to read the complete e-book.
