Von Tech (Von技術)

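How to increase the capacity of an EBS volume

I recently ran out of EBS space; I had filled the volume up completely.
So here is a record of the procedure.

1. Open the EC2 console and select Volumes.
2. Take a snapshot of the existing EBS volume.
3. Shut down the instance, then detach the EBS volume.
4. Create a new, larger EBS volume from that snapshot.
5. Attach the new volume to the instance in place of the old one, then start the instance.
6. Run sudo resize2fs on the volume's device (then wait a while).
7. Run df -h (the space should now match your new EBS size).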

    • #ebs
    • #ec2
    • #aws
    • #console
    • #snapshot
    • #resize2fs
  • 1 week ago

Cardinal Blue: 3 Must-Have SDKs for Mobile App Developers

cardinalblue:

Are you developing mobile apps? Here are 3 SDKs which we have found extremely helpful in deploying and debugging our PicCollage app for iOS/Android, which has been downloaded more than 5 million times.

  • Testflight helps you deploy iOS apps to your test users.
  • Crittercism helps you find…
  • 1 month ago > cardinalblue

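Nginx subdomain redirect configuration

Because of browsers' cross-domain security restrictions,
I have spent the past few days wondering whether to serve everything from subdirectories of a single domain instead.
So I tried to make nginx forward mydomain.com/abc to a service on localhost (port 5984 in the snippets below).
It looks simple and logical; it should just be a matter of adding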

location /abc {
   proxy_pass http://localhost:5984;
}

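But it turned out not to work that way at all: with no URI part on proxy_pass, nginx forwards the full request path, /abc prefix included, to the backend.
So after more than two hours of googling I finally found the fix: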

location /couchdb {
  rewrite /couchdb/(.*)   /$1   break;
  proxy_pass http://localhost:5984;
  proxy_redirect off;
  proxy_set_header Host $host;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

Please accept this humble offering.. :p P.S. The original article is here: Nginx as a reverse proxy
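
To see what the rewrite line does to a request path, here is a small Ruby sketch that applies the same regular expression. It only mimics the path rewriting, not nginx itself; the helper name backend_path and the sample paths are made up for illustration.

```ruby
# Hypothetical helper: applies the same pattern as
#   rewrite /couchdb/(.*) /$1 break;
# to show which path the backend on localhost:5984 would receive.
def backend_path(request_path)
  request_path.sub(%r{/couchdb/(.*)}, '/\1')
end

puts backend_path('/couchdb/_all_dbs')   # prefix stripped
puts backend_path('/couchdb/mydb/doc1')  # prefix stripped, rest of the path kept
puts backend_path('/couchdb')            # no trailing slash, so the pattern does not match
```

The last call shows why a request without the trailing slash is passed through unchanged: the pattern requires "/couchdb/" to be present.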

    • #nginx
    • #subdomain
    • #redirect
    • #proxy_pass
    • #proxy_redirect
  • 1 month ago

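A few memos on CSS & JS

I have been adjusting CSS for IE lately. A few notes on IE's bugs:

  • max-width: use min-width instead, plus width: expression(this.width > your_size ? your_size : true);

  • padding has no effect.

  • The \9 hack can be used to target IE9 specifically, while !important applies to every browser.

And on the JS side:

  • jQuery key bindings on window do not work. Bind to a DOM element instead where possible.

  • jQuery getJSON does not work across domains. Add jQuery.support.cors = true;

  • const does not work. Use var instead.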

    • #ie
    • #css
    • #hack
    • #css3
    • #padding
    • #frontend
  • 2 months ago

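Goliath: a seriously powerful non-blocking framework for Ruby

Goliath is an asynchronous (non-blocking) framework for Ruby.
It is built on top of EventMachine and Rack, and makes it easy for developers to set up APIs or other non-blocking services.
Anyone who has used Rails knows that servers such as Thin and Mongrel block.
In other words, suppose we run 3 Thin instances and each request takes 1 second.
When four people connect to our server at the same time, one of them has to wait, and may get an error, because the server is still handling the others' requests.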
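
The arithmetic behind that example can be sketched in a few lines of Ruby. This is a simplified model, not a benchmark: it assumes every blocking instance handles exactly one request at a time and that all requests arrive at time zero. The method name completion_times is made up for illustration.

```ruby
# Simplified model: each blocking server instance handles one request at a time.
# Returns the time (in seconds) at which each request finishes.
def completion_times(workers:, requests:, seconds_per_request:)
  free_at = Array.new(workers, 0.0)            # when each instance becomes idle
  Array.new(requests) do
    idx = free_at.each_with_index.min.last     # pick the instance that frees up first
    free_at[idx] += seconds_per_request
    free_at[idx]
  end
end

times = completion_times(workers: 3, requests: 4, seconds_per_request: 1.0)
puts times.inspect   # the fourth request finishes a full second after the first three
```

With 3 instances and 4 simultaneous requests, the fourth request cannot start until one instance is free, so it takes twice as long as the others.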

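Our API is built on Goliath as well. Goliath is simple to use, and powerful. With the single API instance we currently run, responses are merely faster or slower; we have seen no blocking or error problems.

Personally, I think Goliath could also serve directly as a website backend.
That is, the backend runs entirely on Goliath, while Apache or nginx in front serves the HTML + JS + CSS.
That combination is incredibly powerful.

Creating routes

With the example below we can build a simple Goliath API that supports the GET and POST methods.
Run ruby goliath_file.rb -p 8080 in a console, then open localhost:8080/hihi in a browser,
and you will see "hello world.".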

require 'goliath'

class Hihi < Goliath::API
  def response(env)
    params = env['params']   # request parameters
    puts "params #{params}"
    [200, { 'Content-Type' => 'text/plain' }, 'hello world.']
  end
end

class Api < Goliath::API
  use Goliath::Rack::Params
  # Route both GET and POST /hihi to the Hihi handler.
  get  '/hihi', Hihi
  post '/hihi', Hihi
end
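
Thoughts on using Goliath as a service

In fact, everything can be done inside Goliath. It just means the model and the controller end up more tightly coupled.
With good design, though, they can still be kept well apart.
Adding DataMapper, ActiveRecord, Sequel, etc. gives you the database connection.
Put Nginx or Apache in front to serve the pages. Super fast!
For even more speed, put the pages straight onto a CDN. Blazing fast. :)

The main reason I have not gone all-in yet is, for the most part, session handling.
It is not really that hard, just tedious.. Ruby lovers all seem to be lazy.. :p

If anyone is interested, tell me which aspect of Goliath you care about and I can write more.. :)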

    • #goliath
    • #eventmachine
    • #event driven
    • #non-blocking
    • #asynchronous
    • #ruby
    • #rails
    • #thin
    • #mongrel
    • #apahce
    • #nginx
    • #api
    • #restful
    • #model
    • #session
    • #controller
    • #mvc
    • #datamapper
    • #activerecord
    • #sequel
  • 2 months ago

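Building a highly concurrent worker with EventMachine

Our worker does not use Beanstalk or AMQP for its processing, although we did consider them…
Instead, it uses plain EventMachine.

With EventMachine's defer we can process a very large amount of data at the same time.
But it also brings a few problems.

  • Starting and managing the worker:

    EventMachine can bind to a specific port by starting a web server. Run as a worker, however, it is hard to maintain.
    We tried a few gems such as Daemon and Foreverb, but none of them was ideal.
    So we looked at how frameworks such as Thin and Goliath do it internally, and learned that they simply fork a process: fork, then let the original process exit. It goes like this: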

    require 'fileutils'

    # Write the daemon's pid to @pid_file, creating the directory if needed.
    def store_pid(pid)
      FileUtils.mkdir_p(File.dirname(@pid_file))
      File.open(@pid_file, 'w') { |f| f.write(pid) }
    end

    Process.fork do
      Process.setsid   # become session leader, detached from the terminal
      exit if fork     # second fork; the intermediate process exits

      @pid_file ||= './your_code_name.pid'
      @log_file ||= File.expand_path('your_code_name.log')
      store_pid(Process.pid)

      File.umask(0000)

      stdout_log_file = "#{File.dirname(@log_file)}/#{File.basename(@log_file)}_stdout.log"

      STDIN.reopen("/dev/null")
      STDOUT.reopen(stdout_log_file, "a")
      STDERR.reopen(STDOUT)

      # Start to run your code or worker here.
    end

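    One thing has to be mentioned here.
    After several email exchanges with the author of Foreverb, Von learned that the gem is used in many places internally at companies such as Twitter.
    Von also found the gem quite pleasant to use.
    Unfortunately, in the time available it was hard to rework the whole architecture to suit Foreverb, so that will have to wait for next time.

  • Data duplication:

    When processing through Defer, it can happen that A fetches item 1 and is about to write it, while B tries to write it too (because A has not written it yet).
    For this kind of duplication, all we can do at the moment is check for duplicates once before writing and once again after writing.
    For this problem, though, we should find time to study how NoSQL systems handle consistency between nodes. :)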
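
As an illustration of the race described above, here is a minimal sketch that makes the "check, then write" step atomic with a Mutex. This is a swapped-in technique, not the check-before-and-after approach the post describes, and it only helps when all the deferred jobs run inside one process. DedupStore and write_once are hypothetical names, and the Set stands in for the real database.

```ruby
require 'set'

# Hypothetical in-memory store standing in for the real database.
class DedupStore
  def initialize
    @seen = Set.new
    @lock = Mutex.new
  end

  # Returns true if the key was written, false if it was already there.
  # The check and the write happen under one lock, so two threads
  # can never both "win" for the same key.
  def write_once(key)
    @lock.synchronize { !@seen.add?(key).nil? }
  end

  def size
    @lock.synchronize { @seen.size }
  end
end

store = DedupStore.new
# 20 threads try to write only 5 distinct keys.
results = 20.times.map { |i| Thread.new { store.write_once(i % 5) } }.map(&:value)
puts "written: #{results.count(true)}, rejected as duplicates: #{results.count(false)}"
```

Across several servers the same guarantee would have to come from the data store itself, which is outside the scope of this sketch.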

  • Timeliness of Defer processing:

    Put simply, EM.defer is a bit like spinning off a thread to handle a task.
    So code wrapped in EM.defer { } will not block the main loop. The official documentation explains defer as follows:

    Note carefully that the code in your deferred operation will be executed on a separate thread from the main EventMachine processing and all other Ruby threads that may exist in your program. Also, multiple deferred operations may be running at once! Therefore, you are responsible for ensuring that your operation code is threadsafe. [Need more explanation and examples.] Don’t write a deferred operation that will block forever. If so, the current implementation will not detect the problem, and the thread will never be returned to the pool. EventMachine limits the number of threads in its pool, so if you do this enough times, your subsequent deferred operations won’t get a chance to run. [We might put in a timer to detect this problem.]

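    In practice, though, Von's deferred jobs are often not processed right away. And sometimes a job simply disappears partway through..
    The text above does say that the number of threads is limited,
    but the actual number does not seem to be stated anywhere? And Von only issued 20 defers… :p
    So this is an area that needs to be handled with more care.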
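
Why a deferred job may not start immediately is easier to see with a toy fixed-size thread pool. The sketch below is plain Ruby and is not EventMachine's implementation; TinyPool and its defer method are hypothetical stand-ins. (As far as I know, EventMachine exposes its own pool size as EventMachine.threadpool_size, with a default of 20.)

```ruby
# Toy fixed-size thread pool: jobs beyond the pool size wait in a queue.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker keeps taking jobs until it sees the :stop marker.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  def defer(&job)
    @jobs << job
  end

  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

pool = TinyPool.new(2)
lock = Mutex.new
running = 0
peak = 0
done = []

4.times do |i|
  pool.defer do
    lock.synchronize { running += 1; peak = [peak, running].max }
    sleep 0.05   # simulated slow work
    lock.synchronize { running -= 1; done << i }
  end
end
pool.shutdown
puts "finished #{done.size} jobs, never more than #{peak} at once"
```

With a pool of 2 and 4 jobs, the last two jobs only start once a worker thread is free, and a job that never returns would hold its thread forever, as the quoted documentation warns.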

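  • Passing data across servers:

    After developing a server-side WebSocket chat room earlier, Von would really like the next worker to use Channel and WebSocket to read events.
    That way data can be passed simply over the web, and it will be faster than writing to a database or making HTTP requests.
    It is also easier to maintain. This approach is a bit like implementing a lightweight Beanstalk or AMQP, except that it uses WebSocket.

There is still a lot of work to do. Next time I will share more about Goliath's async behaviour and stability.
Everyone on the team is very happy with it. :)

If this kind of challenge interests you, we are looking for startup partners. Click here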

    • #amqp
    • #beanstalk
    • #channel
    • #concurrent
    • #em.defer
    • #eventmachine
    • #fork process
    • #goliath
    • #thin
    • #thread
    • #twitter
    • #web socket
  • 2 months ago

myNoSQL: Redis Guide: What Each Redis Data Type Should Be Used For

nosql:

Salvatore Sanfilippo offers a very detailed answer to this question on StackOverflow. Just to give you a glimpse.

  • strings:
    • to avoid converting your already encoded data (JSON, HTML)
    • bitmaps and in general random access arrays of bytes
  • lists:
    • when you are likely to touch only the…
  • 2 months ago > nosql

Paginating With Riak

nosql:

Alexander Sicular explaining why pure key-value stores require a different approach when an application needs to paginate through result sets:

Riak at its core is a distributed key/value persisted data store that also happens to do a lot of other things. Now break that down. Looking at those words individually we have “distributed”, meaning that your data lives on a number of different machines in your cluster. Good thing, right? Yes. However it also means that no single machine is the canonical reference for all your data. Which in turn means that you need to ask multiple machines for your data and those machines will return data to you when they see fit, ie. not in order. Moving on, we have “key/value”. In regards to the topic at hand, this means that Riak has no insight into any data held within your keys, ie. Riak does not care if your stored json object has an age value in it. Next, we have “persisted”. Riak has no native internal index, meaning Riak will not store on disk the data you send it in any useful way - useful to you at least. Lastly, we have “happens to do a lot of other things.” Thankfully for us, one of those other things is Map/Reduce.

Original title and link: Paginating With Riak (NoSQL database©myNoSQL)

  • 2 months ago > nosql

myNoSQL: Bump Chose Basho's Riak for Both Scale and Reliability

nosql:

From the PR announcement about Bump’s usage of Riak:

To ensure consistent up-time and user engagement, Bump chose Basho’s Riak for both scale and reliability. Riak ensures that the Bump application can be continually fed with information without the worry of a system fail. Bump…

    • #riak
    • #nosql
    • #bump
  • 2 months ago > nosql

Custom Logger

ilake:

Ref: http://www.ruby-doc.org/stdlib-1.9.3/libdoc/logger/rdoc/Logger.html

  • 2 months ago > ilake

ReFS: The Next Generation File System for Windows

nosql:

Triggered by the last 2 podcasts of John Siracusa [1], the last few days I’ve read quite a bit about ZFS and Microsoft’s new file system ReFS. The article I’m linking to contains quite a few interesting bits about ReFS, but the following parts caught my attention:

  1. ReFS is not a log structured file system due to torn writes:

    One of the approaches we considered and rejected was to implement a log structured file system. This approach is unsuitable for the type of general-purpose file system required by Windows. NTFS relies on a journal of transactions to ensure consistency on the disk. That approach updates metadata in-place on the disk and uses a journal on the side to keep track of changes that can be rolled back on errors and during recovery from a power loss. One of the benefits of this approach is that it maintains the metadata layout in place, which can be advantageous for read performance. The main disadvantages of a journaling system are that writes can get randomized and, more importantly, the act of updating the disk can corrupt previously written metadata if power is lost at the time of the write, a problem commonly known as torn write.

  2. ReFS is using B+ trees instead:

    On-disk structures and their manipulation are handled by the on-disk storage engine. This exposes a generic key-value interface, which the layer above leverages to implement files, directories, etc. For its own implementation, the storage engine uses B+ trees exclusively. In fact, we utilize B+ trees as the single common on-disk structure to represent all information on the disk. Trees can be embedded within other trees (a child tree’s root is stored within the row of a parent tree). On the disk, trees can be very large and multi-level or really compact with just a few keys and embedded in another structure. This ensures extreme scalability up and down for all aspects of the file system. Having a single structure significantly simplifies the system and reduces code. The new engine interface includes the notion of “tables” that are enumerable sets of key-value pairs. Most tables have a unique ID (called the object ID) by which they can be referenced. A special object table indexes all such tables in the system.

Even if you are not a file systems expert, this is an interesting read.


  [1] Podcasts were a pleasant companion while being sick.

Original title and link: ReFS: The Next Generation File System for Windows (NoSQL database©myNoSQL)

  • 2 months ago > nosql

MapReduce and Massively Parallel Processing (MPP): Two Sides of the Big Data

nosql:

Andrew Brust for ZDNet:

But, for a variety of reasons, MPP and MapReduce are used in rather different scenarios. You will find MPP employed in high-end data warehousing appliances. […] MPP gets used on expensive, specialized hardware tuned for CPU, storage and network performance. MapReduce and Hadoop find themselves deployed to clusters of commodity servers that in turn use commodity disks. The commodity nature of typical Hadoop hardware (and the free nature of Hadoop software) means that clusters can grow as data volumes do, whereas MPP products are bound by the cost of, and finite hardware in, the appliance and the relative high cost of the software. […] MPP and MapReduce are separated by more than just hardware. MapReduce’s native control mechanism is Java code (to implement the Map and Reduce logic), whereas MPP products are queried with SQL (Structured Query Language). […] Nonetheless, Hadoop is natively controlled through imperative code while MPP appliances are queried though declarative query. In a great many cases, SQL is easier and more productive than is writing MapReduce jobs, and database professionals with the SQL skill set are more plentiful and less costly than Hadoop specialists.

I totally agree with Andrew Brust that none of these are good reasons for these platforms to remain separate. Actually when analyzing the importance of the Teradata (MPP) and Hortonworks (Hadoop) partnership, I wrote:

Depending on the level of integration the two team will pull together, this partnership might result in one of the most complete and powerful structured and unstructured data warehouse and analytics platform.

This very same thing could be said about any platform that would offer a viable, fully integrated, cost effective, distributed, structured and unstructured data warehouse or analytics platform. MPP and MapReduce do not represent different sides of the Big Data, but rather complementary approaches for Big Data.

Original title and link: MapReduce and Massively Parallel Processing (MPP): Two Sides of the Big Data (NoSQL database©myNoSQL)

  • 2 months ago > nosql

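On using Goliath: Concurrent & Synchronize

It has been a long time since I wrote anything technical. Lately I have been working on low-level stuff,
mainly the concurrency and synchronization side.

Ruby's threads have always been a bit of a joke, but since 1.9 and JRuby the thread situation has improved a lot.
There are also lightweight fibers to use. :)
In Ruby, the first thing that comes to mind for this kind of work is probably EventMachine.
But when you get down to it, EM is just a souped-up for loop.
So EM::Synchrony is a library that lets us do concurrent processing.
That alone is still not enough, though; add Fibers and both performance and concurrency improve further.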
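
For readers who have not used fibers, here is a minimal plain-Ruby sketch of what "lightweight" means: a fiber is a block of code that can pause itself and be resumed later by the caller, without any thread being created. It uses only the core Fiber class and has nothing EventMachine-specific in it.

```ruby
log = []

fiber = Fiber.new do
  log << :step1
  Fiber.yield :paused   # hand control back to the caller
  log << :step2
  :done                 # value of the final resume
end

first = fiber.resume    # runs the block up to Fiber.yield
log << :caller          # the caller does something in between
second = fiber.resume   # continues right after Fiber.yield

puts log.inspect
puts [first, second].inspect
```

The caller decides when the fiber continues, which is what lets libraries built on fibers hide callbacks behind code that reads top to bottom.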

So code written with EM::Synchrony looks like this:

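require 'em-synchrony'
require 'em-synchrony/fiber_iterator'

EM.synchrony do
  # Process jobs with at most job_concurrent_limit fibers running at once.
  EM::Synchrony::FiberIterator.new(jobs, job_concurrent_limit).each do |your_job|
    do_your_works_here
  end
  EM.stop
end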

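Just wrap the work in two blocks and it runs.

Of course, even written this way it is still not quite concise or easy to manage.

So Goliath is a ready-made framework that handles these things for us.
What do most people use it for? The main thing I can think of is as an API server. :)
And that is how I am using it at the moment too…

With http_router or rack_mapper, Goliath can be used much like Sinatra, which makes it easier to write.

For Goliath, the published figure is 500+ req/s. My own measurements came out higher.
Writing it yourself on raw EM gives even higher numbers than Goliath.. but it is hard to maintain… and not that stable.. :p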

    • #api serve
    • #ruby
    • #concurrent
    • #synchronize
    • #goliath
    • #event machine
    • #rack mapper
    • #http router
  • 3 months ago

Dragonfly: a super-powerful file (image) uploader

    • #gem
    • #ruby
    • #uploader
    • #dragonfly
    • #paperclip
    • #carrierwave
  • 4 months ago

EventMachine - How to write a better evented app

EventMachine

Reactor

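Use an event loop to do something over and over. Inside the event loop (the reactor), keep the "react" behaviour in mind.
How it works: the reactor takes over the Ruby thread and loops continuously, invoking a callback whenever an event fires.

Also, remember that the reactor must not be blocked by a callback.
How do you write event handlers that don't block?

  • No sleep(n)
  • No long loops (100000.times)
  • No long-running I/O (mysql queries)
  • No loops that might run forever or for too long (while !conditions)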
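
To make the "no long loops" rule concrete, here is a toy event loop in plain Ruby. It is only a sketch of the idea, not EventMachine's reactor; ToyReactor and its next_tick method are hypothetical names chosen to echo EM.next_tick. A long job is cut into slices, and each slice reschedules the next one, so other callbacks get a turn in between.

```ruby
# Toy event loop: runs queued callbacks one at a time, in order.
class ToyReactor
  def initialize
    @queue = []
  end

  # Schedule a block to run on a later turn of the loop.
  def next_tick(&block)
    @queue << block
  end

  def run
    @queue.shift.call until @queue.empty?
  end
end

reactor = ToyReactor.new
order = []

# A long job split into small slices instead of one blocking loop.
work = lambda do |remaining|
  order << "chunk(#{remaining})"
  reactor.next_tick { work.call(remaining - 1) } if remaining > 1
end

reactor.next_tick { work.call(3) }
reactor.next_tick { order << "other event" }
reactor.run

puts order.inspect
```

Had the job run as a single loop inside one callback, "other event" would have had to wait until every chunk was finished.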


Fiber

Lightweight, synchronous-style processing.
With Fibers, EventMachine code can escape from callback hell.
Of course, we can also achieve the same thing with EM-Synchrony.
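
How a fiber removes nested callbacks can be sketched without EventMachine at all. In the example below, FakeReactor and async_fetch are hypothetical stand-ins for a callback-style API, and sync_fetch is an illustrative wrapper: it pauses the current fiber and resumes it from the callback, so the calling code reads as if it were synchronous.

```ruby
require 'fiber'

# Hypothetical callback-style API: callbacks fire later, when run is called.
class FakeReactor
  def initialize
    @pending = []
  end

  def async_fetch(key, &callback)
    @pending << -> { callback.call("value-of-#{key}") }
  end

  def run
    @pending.shift.call until @pending.empty?
  end
end

# Wrap the callback API: pause this fiber until the callback resumes it.
def sync_fetch(reactor, key)
  fiber = Fiber.current
  reactor.async_fetch(key) { |value| fiber.resume(value) }
  Fiber.yield   # returns the value passed to fiber.resume
end

reactor = FakeReactor.new
results = []

Fiber.new do
  a = sync_fetch(reactor, :a)   # no nested callback needed
  b = sync_fetch(reactor, :b)
  results << a << b
end.resume

reactor.run
puts results.inspect
```

The two fetches are written one after the other, yet nothing blocks: each one hands control back to the loop until its callback fires.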

Key features

Deferrables

Purpose: lets us write event-driven code, with callbacks fired when any object succeeds or fails.
This way, we can spawn a thread.
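
The idea behind a deferrable can be shown with a small stand-alone class. This is a simplified sketch, not EM::Deferrable itself: MiniDeferrable is a hypothetical name, it borrows the method names callback, errback, succeed and fail, and unlike the real module it keeps only the first outcome.

```ruby
# Simplified stand-in for a deferrable: an object that will succeed or fail later.
class MiniDeferrable
  def initialize
    @callbacks = []
    @errbacks  = []
    @status    = :unknown
    @args      = []
  end

  # Run the block on success (immediately if success already happened).
  def callback(&block)
    if @status == :succeeded
      block.call(*@args)
    else
      @callbacks << block
    end
    self
  end

  # Run the block on failure (immediately if failure already happened).
  def errback(&block)
    if @status == :failed
      block.call(*@args)
    else
      @errbacks << block
    end
    self
  end

  def succeed(*args)
    settle(:succeeded, args, @callbacks)
  end

  def fail(*args)
    settle(:failed, args, @errbacks)
  end

  private

  def settle(status, args, handlers)
    return unless @status == :unknown   # in this sketch only the first outcome counts
    @status = status
    @args = args
    handlers.each { |handler| handler.call(*args) }
  end
end

events = []
job = MiniDeferrable.new
job.callback { |result| events << "ok: #{result}" }
job.errback  { |reason| events << "failed: #{reason}" }

job.succeed(42)
job.fail('too late')                            # ignored, the job already succeeded
job.callback { |result| events << "late: #{result}" }  # fires at once

puts events.inspect
```

Code that starts a slow operation can return such an object right away, and the caller attaches whatever should happen on success or failure.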

Channel

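Purpose: pub/sub. All subscribers receive their messages on the main loop.

Queue

Purpose: works across threads; everything pushed from threads is received in order on the main loop.

Iterator

Purpose: lets events run concurrently. With map or each we can feed in many events at once. For example (each item must call iter.return, and the second proc receives the collected results):

EM::Iterator.new(array, 10).map(proc { |item, iter| iter.return(something_long()) },
                                proc { |results| p results })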



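Writing tips

  • The main loop should be as fast as possible
  • I/O-related work should be handled on the main loop
  • Anything that might wait a long time should be pushed to the background with EM.defer
  • When a large write or transfer would slow the main loop down, you can use EM.next_tick
  • For a Deferrable object, you can set a timeout that moves it to :failed to avoid waiting too long, or reset the callbacks with set_deferred_status(nil)
  • I/O-bound work should be handled in an event-driven way; the premise of event-driven code is that nothing in it is CPU-bound

Resources that work with EM: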

  • mysql2
  • Activerecord
  • em-http-request
  • em-memcached
  • em-mongo
  • mongoid
  • em-jack
  • em-synchrony
  • goliath

Interesting EM protocols:

  • Redis
  • MongoDB
  • CouchDB
  • Beanstalk
  • IRC
  • Thrift
  • Solr
  • SSH
  • XML Push Parser
  • Memcache
  • XMPP
  • DNS

Further reading:
igvita : Untangling Evented Code with Ruby Fibers
Dr Nic Williams : Threading versus Evented
Mike Perham : Scaling Ruby with Actors, or How I Learned to Stop Worrying and Love Threads
Jonathan Weiss : Advanced Eventmachine
EventMachine Protocol Implementation

    • #event machine
    • #fiber
    • #reactor
    • #node.js
    • #node js
    • #em defer
    • #main loop
    • #igvita
    • #nic williams
    • #mike perham
    • #ruby conf
    • #thread
  • 5 months ago

About

Von on rails. Ruby, Rails, Neo4j, Riak ...etc

Me, Elsewhere

  • @vonstark32 on Twitter
  • Facebook Profile
