|
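Material like this may have been available here for a long time, but I was not aware of it. I have only just started learning game programming on Linux (and have no game programming experience on other platforms such as Windows), and the difficulties turned out to be far greater than I had imagined. Only afterwards did I realize how important this kind of basic theory is for developing games and similar software. I cannot translate it all in one go, so I am posting the material now. Anyone who is able is welcome to help with the translation, and I will keep updating the post.
The material is as follows: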
[code:1]
A Fixed Logic Frame Rate Demo
----------------------------------------------
This SDL programming example - a simple platform
game - demonstrates the use of a fixed virtual
logic frame rate together with interpolation, for
smooth and accurate game logic that is independent
of the rendering frame rate.
The example also demonstrates sprite animation and
partial display updating techniques, suitable for
games and applications that need high frame rates
but can do without updating the whole screen every
frame.
Fixed Logic Frame Rate
----------------------
Having a fixed logic frame rate means that the game
logic (that is, what defines the gameplay in terms
of object behavior and user input handling) runs a
fixed number of times per unit of time. This makes
it possible to use "frame count" as a unit of time.
More interestingly, since the logic frame rate can
be set at any sufficient value (say, 20 Hz for a
slow turn based game, or 100 Hz for fast action),
the logic code will run exactly once per logic
frame. Thus, there is no need to take delta times
into account, solve equations, or make calculations
on velocity, acceleration, jerk and the like. You
can just deal with hardcoded "step" values and
simple tests.
Perhaps most importantly, you can *still* rely
on the game behaving *exactly* the same way,
regardless of the rendering frame rate or other
system dependent parameters - something that is
virtually impossible with delta times, since you
cannot have infinite accuracy in the calculations.
Virtual Logic Frame Rate
------------------------
By "virtual", I mean that the actual frame rate is
not necessarily stable at the nominal value at all
times. Rather, the *average* logic frame rate is
kept at the nominal value by means of controlling
the number of logic frames processed for each
rendered frame.
That is, if the rendering frame rate is lower than
the nominal logic frame rate, the engine will run
the game logic several times before rendering each
frame. Thus, the game logic may actually be running
at tens of kHz for a few frames at a time, but this
doesn't matter, as long as the game logic code
relies entirely on logic time.
So, do not try to read time using SDL_GetTicks() or
similar in the game logic code! Instead, just count
logic frames, like we did back in the C64 and Amiga
days, where video frames were actually a reliable
time unit. It really works!
Resampling Distortion
---------------------
Now, there is one problem with fixed logic frame
rates: resampling distortion. (This is the same
phenomenon that causes poor audio engines to squeal
and feep when playing back waveforms at certain
pitches.)
The object coordinates generated by the game logic
engine can be thought of as streams of values
describing signals (in electrical engineering/DSP
terms) with a fixed sample rate. Each coordinate
value is one stream.
Since the logic frame rate is fixed, and the game
logic runs an integer number of times per rendered
frame, what we get is a "nearest point" resampling
from the logic frame rate to the rendering frame
rate. That's not very nice, since only the last set
of coordinates after each run of logic frames is
actually used - the rest are thrown away!
What's maybe even worse, especially if the logic
frame rate is low, is that when the rendering frame
rate is higher than the logic frame rate, you get
new coordinates only every now and then.
Getting Smooth Animation
------------------------
So, what do we do? Well, given my hint above, the
answer is probably obvious: interpolation! We just
need to replace the basic "nearest sample" method
with something better.
Resampling is a science and an art in the audio
field, and countless papers have been written on
the subject, most of which are probably totally
incomprehensible to anyone who hasn't got a degree
in maths.
However, our requirements for the resampling can be
kept reasonably low by keeping the logic frame rate
relatively high (i.e. in the same order of magnitude
as the expected rendering frame rate) - and we
generally want to do that anyway, to reduce the
game's control latency.
Choosing An Interpolator
------------------------
Since the rendering frame rate can vary constantly
in unpredictable ways, we will have to recalculate
the input/output ratio of the resampling filter for
every rendered frame.
However, using a polynomial interpolator (as opposed
to a FIR resampling filter), we can get away without
actually doing anything special. We just feed the
interpolator the coordinates and the desired
fractional frame time, and get the interpolated
coordinates back.
DSP people will complain that a polynomial
resampler (that is, without a brickwall filter, or
oversampling + bandlimited downsampling) doesn't
really solve the whole problem. Right, it doesn't
remove frequencies above the Nyquist frequency of
the rendering frame rate, so those can cause
aliasing distortion. But let's consider this:
Do we actually *have* significant amounts of
energy at such frequencies in the data from the
game logic? Most probably not! You would have to
have objects bounce around or oscillate at insane
speed to get anywhere near the Nyquist frequency
of (that is, 50% of) any reasonable (i.e. playable)
rendering frame rate. In fact, we can probably
assume that we're dealing with signals in the range
0..10 Hz. Not even the transients caused by abrupt
changes in speed and direction will cause visible
side effects.
So, in this programming example, I'm just using a
simple linear interpolator. No filters, no
oversampling or anything like that. As simple as it
gets, but still an incredible improvement over
"nearest sample" resampling. You can enable/disable
interpolation with the F1 key when running the
example.
Rendering Sprites
-----------------
In order to cover another animation related FAQ,
this example includes "smart" partial updates of
the screen. Only areas that are affected by moving
and/or animated sprites are updated.
To keep things simple and not annoyingly
non-deterministic, updates are done by removing all
sprites, updating their positions and animation
frames, and then rendering all sprites. This is
done every frame, and includes all sprites, whether
they move or not.
So, why not update only the sprites that actually
moved? That would allow for cheap but powerful
animated "backgrounds" and the like.
Well, the problem is that sprites can overlap, and
when they do, they start dragging each other into
the update loop, leading to recursion and
potentially circular dependencies. A non-recursive
two-pass (mark + render) algorithm is probably a
better idea than actual recursion. It's quite
doable and neat, if the updates are restricted by
clipping - but I'll leave that for another example.
Pretty much all sprites in Fixed Rate Pig move all
the time, so there's nothing to gain by using a
smarter algorithm.
Efficient Software Rendering
----------------------------
To make it a bit more interesting, I also added
alpha blending for sprite anti-aliasing and effects.
Most 2D graphics APIs and drivers (and as a result,
most SDL backends) lack hardware acceleration of
alpha blended blits, which means the CPU has to
perform the blending. That's relatively expensive,
but SDL's software blitters are pretty fast, and it
turns out *that's* usually not a problem.
However, there is one problem: Alpha blending
requires that data is read from the target surface,
modified, and then written back. Unfortunately,
modern video cards handle CPU reads from VRAM very
poorly. The bandwidth for CPU reads - even on the
latest monster AGP 8x card - is on par with that of
an old hard drive. (I'm not kidding!)
This is why I wanted to demonstrate how to avoid
this problem, by rendering into a software back
buffer instead of the hardware display surface. If
you're on a system that supports hardware display
surfaces, you can see the difference by hitting F2
in the game, to enable/disable rendering directly
into VRAM.
Indeed, SDL can set up such a back buffer for you,
but *only* if you ask for a single buffered display
- and we do NOT want that! Single buffered displays
cannot sync animation with the retrace, and as a
result, we end up hogging the CPU (since we never
block, but just pump out new frames) and still
getting unsmooth animation.
Incidentally, this approach of using a software back
buffer for rendering mixes very well with partial
update strategies, so it fits right in.
Smart Dirty Rectangle Management
--------------------------------
The most complicated part of this implementation is
keeping track of the exact areas of the screen that
need updating. Just maintaining one rectangle per
sprite would not be sufficient. A moving sprite has
to be removed, animated and then re-rendered.
That's two rectangles that need to be pushed to the
screen: one to remove the old sprite image, and one
for the new position.
On a double buffered display, it gets even worse,
as the rendering is done into two alternating
buffers. When we update a buffer, the old sprites
in it are actually *two* frames old - not one.
I've chosen to implement a "smart" rectangle merging
algorithm that can deal with all of this with a
minimum of support from higher levels. The algorithm
merges rectangles in order to minimize overdraw and
rectangle count when blitting to and updating the
screen. See the file dirtyrects.txt for details.
You can (sort of) see what's going on by hitting F3
in the game. The procedure is as follows:
1. All sprites are removed from the rendering
   buffer. The required information is found in
   the variables that store the results of the
   interpolation.
2. The dirtyrect table for the display surface is
   swapped into a work dirtyrect table. The
   display surface dirtyrect table is cleared.
3. New graphic coordinates are calculated, and all
   sprites are rendered into the rendering buffer.
   The bounding rectangles are fed into the
   display surface dirtyrect table.
4. The dirtyrect table compiled in step 3 is
   merged into the work dirtyrect table. The
   result covers all areas that need to be updated
   to remove old sprites and make the new ones
   visible.
5. The dirtyrect table compiled in step 4 is used
   to blit from the rendering buffer to the
   display surface.
On a double buffered display, there is one dirtyrect
table for each display page, and there is
(obviously) a page flip operation after step 5, but
other than that, the algorithm is the same.
[/code:1] |
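To make the fixed logic frame rate and the linear interpolation described in the README above more concrete, here is a minimal sketch in C. It is not taken from the demo's source: the names `Object`, `logic_frame`, `advance` and `interpolate`, and the 20 Hz value of `LOGIC_FPS`, are assumptions made for illustration. The elapsed time `dt_ms` is assumed to be measured by the rendering loop (for example from differences between SDL_GetTicks() readings), never by the game logic itself.

```c
/* Hypothetical nominal logic frame rate in Hz (an assumption). */
#define LOGIC_FPS 20.0

typedef struct {
    double ox;  /* position after the previous logic frame */
    double x;   /* position after the latest logic frame */
    double vx;  /* hardcoded step per logic frame, no delta time */
} Object;

/* One logic frame: remember the old position, then take one step. */
void logic_frame(Object *o)
{
    o->ox = o->x;
    o->x += o->vx;
}

/*
 * Called once per rendered frame with the real time that has passed.
 * Runs zero or more whole logic frames so that the *average* logic
 * frame rate stays at LOGIC_FPS, and leaves the fractional logic
 * frame time (0 <= *frac < 1) for the interpolator.
 * Returns the number of logic frames that were run.
 */
int advance(double *frac, Object *o, double dt_ms)
{
    int ran = 0;
    *frac += dt_ms * LOGIC_FPS / 1000.0;
    while (*frac >= 1.0) {
        logic_frame(o);
        *frac -= 1.0;
        ran++;
    }
    return ran;
}

/* Linear interpolation between the last two logic positions. */
double interpolate(const Object *o, double frac)
{
    return o->ox + (o->x - o->ox) * frac;
}
```

With a step of 4 units per logic frame, a rendered frame that took 125 ms runs two logic frames and leaves half a frame over, so the sprite would be drawn at 6 rather than snapping to 8. Using `o->x` directly instead of `interpolate()` gives the "nearest sample" behavior the README warns about.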
|
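Steps 2 to 4 of the dirty rectangle procedure in the README above can be sketched roughly as follows. This is a deliberately simplified stand-in, not the algorithm documented in dirtyrects.txt: it just replaces any overlapping rectangles with their bounding box, whereas the real one also weighs overdraw against rectangle count. The names `Rect`, `DirtyTable`, `dirty_add`, `dirty_merge` and the `MAX_RECTS` limit are all assumptions made for illustration.

```c
#define MAX_RECTS 64  /* hypothetical table size */

typedef struct { int x, y, w, h; } Rect;
typedef struct { Rect r[MAX_RECTS]; int count; } DirtyTable;

/* True if the two rectangles share at least one pixel. */
int rects_overlap(const Rect *a, const Rect *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Smallest rectangle that contains both a and b. */
Rect rect_union(const Rect *a, const Rect *b)
{
    int x1 = a->x < b->x ? a->x : b->x;
    int y1 = a->y < b->y ? a->y : b->y;
    int ax2 = a->x + a->w, bx2 = b->x + b->w;
    int ay2 = a->y + a->h, by2 = b->y + b->h;
    Rect u;
    u.x = x1;
    u.y = y1;
    u.w = (ax2 > bx2 ? ax2 : bx2) - x1;
    u.h = (ay2 > by2 ? ay2 : by2) - y1;
    return u;
}

/* Add one rectangle, absorbing every table entry it overlaps. */
void dirty_add(DirtyTable *t, Rect n)
{
    int i = 0;
    while (i < t->count) {
        if (rects_overlap(&t->r[i], &n)) {
            n = rect_union(&t->r[i], &n);
            t->r[i] = t->r[--t->count];  /* drop absorbed entry */
            i = 0;  /* the grown rect may now touch earlier entries */
        } else {
            i++;
        }
    }
    if (t->count < MAX_RECTS)
        t->r[t->count++] = n;
    else
        t->r[0] = rect_union(&t->r[0], &n);  /* full: fold into entry 0 */
}

/* Step 4: merge the display surface table into the work table. */
void dirty_merge(DirtyTable *work, const DirtyTable *display)
{
    int i;
    for (i = 0; i < display->count; i++)
        dirty_add(work, display->r[i]);
}
```

After the merge, the work table covers both the old and the new sprite positions, so one blit per entry (step 5) removes the old images and shows the new ones. A sprite that moved only a few pixels ends up as a single rectangle instead of two overlapping ones.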