Interesting article. Good one. [C Is Not a Low-level Language. Your computer is not a fast PDP-11. 2018-04-30 By David Chisnall. At https://queue.acm.org/detail.cfm?id=3212479 ]
So, in recent years, I became more aware that a programming language, or any algorithm, REQUIRES a machine to work with. This is the CPU design. This is fascinating. What machine you have changes the nature of your language entirely. It also changes the profile of efficiency and the type of tasks it is suited for.
If you are writing pseudocode, or hand-waving in theory, you don't need a machine. But as soon as you actually compute, a painstaking, nitty-gritty language becomes necessary, and so does an actual machine for the language to work on.
The design of a machine dictates what your language is going to be like, and what kind of tasks it does well. For example, Lisp had Lisp machines, and there was talk of a Java CPU around 1999. (Note: Java runs on the Java VIRTUAL MACHINE, which is in turn translated to the actual machine.)
Turing machine tapes, formal languages, automata, are abstract yet concrete machines. Integers, or set theory, are also abstract yet concrete machines that math works on. As soon as you talk about step-by-step instructions (aka an algorithm), you need a machine.
A virtual, abstract machine is easy. But for actually running hardware, we have gone from fingers, to stones, to metal cogs and wheels, to vacuum tubes, to solid-state transistors, to printed circuits, to the incomprehensible silicon die. And there are DNA and quantum machines too.
If we just think about the CPU/GPU of the current era, there are lots of designs, aka architectures. I don't know much about this, but the questions are: what are the possible designs and their pros and cons, what are the barriers to a new design, and how does it affect the languages and the computing tasks we do today?
A machine can be abstracted into an “Instruction Set Architecture”, i.e. a set of instructions. These make up a machine language, the lowest form of programming language. Each instruction does things like move this stone over there, or put this number here, add 2 numbers, etc.
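For example, here is a tiny C function, with a comment sketching roughly the kind of x86-64 instructions a compiler might emit for it. (Just an illustration; the exact instructions depend on the compiler and optimization flags.)

    /* add.c -- one line of C, with a comment sketching roughly the kind of
       machine instructions an x86-64 compiler might emit for it. */
    int add(int a, int b) {
        /* roughly:
             mov eax, edi   ; copy the first argument into register eax
             add eax, esi   ; add the second argument to it
             ret            ; return; the result is left in eax          */
        return a + b;
    }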
One example of an Instruction Set Architecture (ISA) is the x86 instruction set. Given an ISA, there can be many different implementations in hardware. A design of such an implementation is called a microarchitecture. It involves the actual design and making of the circuits and electronics.
So now, there are 2 interesting things about designing a machine to run a computer language. One is the Instruction Set Architecture, and subordinate to it is the so-called microarchitecture, which is the actual design and building of the machine, within the constraints of electronics and physical reality.
For our interest in programming language design and how it is bound by the design of the hardware it runs on, the interesting thing is the instruction set architecture. See https://en.wikipedia.org/wiki/Instruction_set_architecture and learn about CISC, RISC, and other design issues.
I guess the design of a CPU carries as much baggage of human habit as a computer language. For languages, change has to wait till a generation dies off. For CPUs, the thing is corporate power struggles to keep (their) design in place, and a hard barrier to entry. (Of course, there is also backward compatibility, for both.)
So now when we see programming language speed comparisons, we can say, and it is in fact technically true, that it's not that #haskell or functional programming is slower than C, but rather that the CPUs we use today are designed to run C-like instructions.
Just from guessing, I thought it'd be strange for a full language (by today's standard) to use a stack as on HP calculators. Because a stack is fine as a data structure, but hard to program with, as one has to push, pop, or swap constantly... I imagine Forth is not like the language on HP calculators, because that seems too primitive? Maybe at the time it wasn't, like Basic? (See the sketch below for what stack-style evaluation looks like.)
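Here is a minimal sketch in C of stack-based (RPN) evaluation. It is a toy, not any particular calculator or Forth implementation; it only handles single-digit numbers and the + and * operators:

    /* rpn.c -- toy sketch of stack-based (RPN) evaluation, as on HP
       calculators or in Forth-like languages. Single-digit operands only;
       only + and * are handled. */
    #include <stdio.h>

    int main(void) {
        const char *prog = "34+2*";   /* means (3 + 4) * 2 in RPN */
        double stack[64];
        int top = 0;                  /* index of the next free slot */

        for (const char *p = prog; *p; p++) {
            if (*p >= '0' && *p <= '9') {
                stack[top++] = *p - '0';                     /* push an operand */
            } else {
                double b = stack[--top];                     /* pop two operands... */
                double a = stack[--top];
                stack[top++] = (*p == '+') ? a + b : a * b;  /* ...push the result */
            }
        }
        printf("%g\n", stack[0]);     /* prints 14 */
        return 0;
    }

Note how every step is a push or pop on the stack; that is the constant push/pop/swap feeling.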
Apparently, there is a completely different CPU design from the 1980s, called the Transputer. The language for it is occam. Very nice. The language is more tightly tied to the CPU than C is to x86 chips. Thanks to @mmzeeman.
Even a good RAM chip will sometimes flip a bit, due to cosmic rays. This is partly why error-correcting code memory (ECC memory) was invented. ECC is supported in Intel Xeon chips. Xeon is x86-compatible but with added niceties such as ECC support, more cores, more cache, etc.
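To see the idea behind ECC, here is a minimal sketch in C of a Hamming(7,4) code: 4 data bits are stored with 3 parity bits, and any single flipped bit can be located and corrected. (Real ECC memory uses wider codes of the same family, typically 8 check bits per 64 data bits; this is only to show the principle.)

    /* hamming.c -- sketch of the idea behind ECC memory: a Hamming(7,4)
       code stores 4 data bits with 3 parity bits and can locate and fix
       any single flipped bit. */
    #include <stdio.h>

    /* encode 4 data bits d[0..3] into a 7-bit codeword at positions c[1..7] */
    static void encode(const int d[4], int c[8]) {
        c[3] = d[0]; c[5] = d[1]; c[6] = d[2]; c[7] = d[3];  /* data bits */
        c[1] = c[3] ^ c[5] ^ c[7];   /* parity over positions 1,3,5,7 */
        c[2] = c[3] ^ c[6] ^ c[7];   /* parity over positions 2,3,6,7 */
        c[4] = c[5] ^ c[6] ^ c[7];   /* parity over positions 4,5,6,7 */
    }

    /* recompute parities; the result (the "syndrome") is the position of the
       flipped bit, or 0 if no bit is flipped */
    static int syndrome(const int c[8]) {
        int s1 = c[1] ^ c[3] ^ c[5] ^ c[7];
        int s2 = c[2] ^ c[3] ^ c[6] ^ c[7];
        int s4 = c[4] ^ c[5] ^ c[6] ^ c[7];
        return s1 + 2 * s2 + 4 * s4;
    }

    int main(void) {
        int d[4] = {1, 0, 1, 1};
        int c[8];
        encode(d, c);

        c[6] ^= 1;                        /* simulate a cosmic-ray bit flip */
        int pos = syndrome(c);
        printf("flipped bit at position %d\n", pos);            /* prints 6 */
        if (pos) c[pos] ^= 1;             /* flip it back, i.e. correct it */
        printf("syndrome after correction: %d\n", syndrome(c)); /* prints 0 */
        return 0;
    }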
In this foray into the relations of algorithm and machine, I learned quite a lot of things. Many familiar terms about CPUs: ISA, x86, Xeon, CISC, RISC, MIPS, FLOPS, ECC, superscalar, instruction-level parallelism: now I understand them.
Before, whenever a new chip came into a popular product, such as Apple switching to PowerPC in 1994, then to Intel in 2006, or the ARM CPUs in smartphones, I read that all software needed to be rewritten. I was always annoyed and shocked, and didn't know why. Now I know.
Because an algorithm is fundamentally meaningful only to a specific machine. Note, new CPU designs (aka instruction set architectures) are inevitable. That means all software needs to be rewritten again. (Practically, it needs to be recompiled; the compilers need to be rewritten to target the new machine, and that may take years to mature.)
What if we don't rewrite, and instead rely on an emulation layer? An order of magnitude slower! That's why IBM's Deep Blue chess machine had its own dedicated chips, audio wants its own processor (DSP), Google invented its own chip (the TPU) for machine learning, and video games need a GPU.
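Here is a minimal sketch in C of why emulation costs so much: a toy fetch-decode-execute loop for a made-up 3-instruction machine. Every emulated instruction costs several host instructions (fetch, decode, dispatch), and that overhead is where much of the slowdown comes from. The instruction set here is hypothetical.

    /* emu.c -- toy fetch-decode-execute loop for a made-up 3-instruction
       machine, to show the per-instruction overhead of emulation. */
    #include <stdio.h>

    enum { OP_LOAD, OP_ADD, OP_HALT };   /* made-up opcodes */

    int main(void) {
        /* toy program: load 3 into r0, load 4 into r1, add r1 into r0, halt */
        int program[] = { OP_LOAD, 0, 3,  OP_LOAD, 1, 4,  OP_ADD, 0, 1,  OP_HALT };
        int reg[4] = {0};
        int pc = 0;                               /* program counter */

        for (;;) {
            int op = program[pc];                 /* fetch */
            switch (op) {                         /* decode + dispatch */
            case OP_LOAD: reg[program[pc+1]]  = program[pc+2];      pc += 3; break;
            case OP_ADD:  reg[program[pc+1]] += reg[program[pc+2]]; pc += 3; break;
            case OP_HALT: printf("r0 = %d\n", reg[0]); return 0;    /* prints 7 */
            }
        }
    }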
Back in the 1990s, I saw SGI machines demoing their 3D prowess, or some graphics cards. But when rendering 3D raytracing, they gave zero speedup. I thought, wtf? How is it possible that some fantastically expensive graphics machine is useless when rendering 3D? It was a great puzzle to me.
The tasks a GPU is designed to do are, in general, different from raytracing 3D scenes. Video processing, 3D graphics, 2D graphics, and other computations in video games are actually different math computations. They are all roughly linear algebra, but the specifics a chip is designed for make a big difference.
@jeremiah wasn't it a modern advance to actually use a GPU to do general computation? ...
I still don't fully understand how exactly a GPU differs from a CPU, or why a CPU can't just be a GPU, or how exactly one uses a GPU to do general computation. I know the basic principles... but I think one would need actual experience designing chips, or writing specific compilers.
@xahlee On the abstract level, it's another architecture -- but to get specific -- it's an architecture that is specialized to do very specific kinds of jobs. There happens to be lots of them, giving them sort of "mutant powers" relative to a normal CPU, and limitations (they're not meant to be a CPU, of course.)
Read up on CUDA. If you're interested, I'll dig up some other stuff for you.
@xahlee also think of the history of the GPU. First, it meant something that did blitting, sprites, high-speed ops to offload tedium from 7-100 MHz CPUs -- look at the old Amiga's custom graphics chips as the final phase of this evolution.
Then came the "graphics pipeline" -- that was the idea that SGI engineered. That created the OpenGL standard (and hardware protoform) that became the accelerated _3D_ graphics we knew in the late 90s.
Today, the GPU is a super MMU, multicore 3D + video processing + 2D...
@jeremiah just read about CUDA, only the 1st paragraph so far. It makes so much sense now! Basically an API to the GPU! ... Ah, so, from a programmer's point of view, that's how one uses a GPU for general computation!
@xahlee ... that's how one does it relatively easily. Before that, it was like graphics card + driver hacking, but people were doing it (to my knowledge) as early as 2004. The CUDA standard came out in 2007.
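To make the “API to the GPU” idea concrete, here is a minimal sketch of general computation via CUDA: a hypothetical vector addition, where a “kernel” function runs on thousands of GPU threads at once and the CPU-side code just allocates memory and launches it (compile with nvcc):

    /* vecadd.cu -- minimal sketch of general computation on a GPU via CUDA. */
    #include <stdio.h>

    __global__ void add(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  /* each thread handles one index */
        if (i < n) c[i] = a[i] + b[i];
    }

    int main(void) {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *a, *b, *c;
        cudaMallocManaged(&a, bytes);   /* unified memory, visible to CPU and GPU */
        cudaMallocManaged(&b, bytes);
        cudaMallocManaged(&c, bytes);
        for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

        add<<<(n + 255) / 256, 256>>>(a, b, c, n);   /* launch ~4096 blocks of 256 threads */
        cudaDeviceSynchronize();                     /* wait for the GPU to finish */

        printf("c[0] = %g\n", c[0]);                 /* prints 3 */
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }

The point is that the same C-like code is fanned out across thousands of lightweight GPU threads, which is exactly the kind of work a CPU's few heavyweight cores are not built for.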
@xahlee @jeremiah If you want to understand digital logic, there's no substitute for designing it. Hardware Description Languages, like Verilog, can be used to design digital hardware. If you get tired of software tests or want to take advantage of hardware speed, you can even load your design into FPGAs, a kind of programmable hardware.